Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-16 Thread Vinay Sajip
PJ Eby writes:

> I think you're thinking I'm describing a single level namespace; I'm
> referring to something like this:
> 
> {group1: {anykey: export_or_mapping}}

I got that :-)
 
> "anykey" is not validated by the spec, only by registered validators
> for "group1".  Of course it has to have some meaning that is
> interpretable by consumers of group1.  The point is that the *spec*
> only defines the syntax of group names and export strings, and it's
> left to specific groups to define the syntax/semantics of the keys.

Ok, I get what you mean now. But an appropriate validator would be one
defined by the definer of the group, right? That code may not be present
on the system. Any validators registered by third parties could disagree;
what then? It may be better to say that a consumer either asks for an entry
against a specific key which it understands, or iterates over all keys and
ignores those whose meaning it doesn't recognise.
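
The two consumer behaviours described above can be sketched roughly as
follows. This is only an illustration: the group name, the keys and the
helper names (lookup, iter_recognised) are hypothetical and do not come
from PEP 426 or any existing library.

```python
# Hypothetical metadata for one installed distribution. The group and key
# names are invented for illustration only.
exports = {
    "myapp.plugins": {
        "csv": "myapp_csv.reader:CsvReader",
        "xml": "myapp_xml.reader:XmlReader",
        "x-experimental": {"some": "mapping"},
    }
}

# Keys whose meaning this particular consumer understands (an assumption).
KNOWN_KEYS = {"csv", "xml"}


def lookup(exports, group, key):
    """Ask for a single entry under a key the consumer understands."""
    return exports.get(group, {}).get(key)


def iter_recognised(exports, group, known_keys):
    """Iterate over a group, ignoring keys the consumer does not recognise."""
    for key, value in sorted(exports.get(group, {}).items()):
        if key in known_keys:
            yield key, value
```

Either way, the consumer never needs a third-party validator to be
installed; it simply skips what it cannot interpret.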
 
> I gave one example already: i18n/l10n information that's about files
> contained in the distribution's data.  It's quite possible to have
> distributions without any code, only data of that kind.  A requirement
> to create code, just to specify the data seems rather pointless.  In
> Nick's reply, he's listed other use cases.

Right, but your case here and the cases Nick mentions belong, I think,
outside what I understand the scope of PEP 426 to be. That is to say,
metadata which describes elements of an installed distribution and
dependency information used when installing or uninstalling it. I don't see
any reason why we can't have an "extensions" subdict which is free-format in
the PEP 426 dict for holding other information not described further in the
spec, but that's just as a grab-bag for ancillary data.

I've probably been thinking at cross purposes when discussing the term
"extension". It's a somewhat overloaded term, what with C extensions and all :-)

> After this further discussion, I think that the use cases we're
> discussing really boil back down to exports vs. metadata extensions,
> and that maybe we should stick to them being separate.

I'm OK with that, but there is a lot of packaging metadata which properly
lives outside PEP 426, and which one wouldn't want to see ending up in the
metadata extensions just because they're there. For example, the information
conventionally passed in setuptools.setup(package_data=...), which is often
used to specify i18n/l10n data.

Regards,

Vinay Sajip

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-16 Thread PJ Eby
On Fri, Aug 16, 2013 at 8:04 AM, Nick Coghlan  wrote:
> Concrete extension use cases I have in mind that don't fit in the
> exports/entry-point data model:
>
> - the mapping of prebuilt executable names to wheel contents
> - platform specific external dependencies and other hints for conversion to
> platform specific formats (e.g. Windows GUIDs)
> - metadata for build purposes (e.g. the working directory to use when
> building from a VCS checkout, so you can have multiple projects in the same
> repo)
> - project specific metadata (e.g. who actually did the release)
> - security metadata (e.g. security contact address and email GPG
> fingerprint)
>
> This is why extensions/exports were originally separate, and may still
> remain that way.

But exports should still be used for hooks defined by the spec; they
are the Obvious Way to specify importable hooks, and *not* dogfooding
there is still a bad idea.

(To be clear, I was never exactly *enthused* about the idea of merging
extensions and exports, just *unenthused* about the idea of the spec
using extensions or custom metadata to do things that could be
perfectly well expressed as exports from a reserved namespace.)

I'm kind of toying with the idea that the core metadata itself could
be carved into extension namespaces, with the core itself being just
another extension, rather than nesting extensions and exports inside
the core, so that the entire spec is just a relatively-flat collection
of namespaces, in more human-digestible groups.

There are some conceptual and practical advantages to that, at least
in principle, but until I've played around with actually expressing
some concepts from the PEP that way, I won't know whether it would
actually pay off in practice.


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-16 Thread PJ Eby
On Fri, Aug 16, 2013 at 6:21 AM, Vinay Sajip  wrote:
> PJ Eby writes:
>
>> I guess I didn't explain it very well, because that's roughly what I
>> meant: a single namespace for all "extensions", structured as a
>> mapping from group names to submappings whose keys can be arbitrary
>> values, but whose values must then be either a string or a JSON
>> object, and if it's a string, then it should be an export specifier.
>
> Why should the keys be completely arbitrary?

By "arbitrary" I mean only that the PEP doesn't place syntactical
restrictions on them.


> I can't see a reason for this;
> the current constraint of the "prefixed name" seems sufficient. What would
> relaxing this constraint make possible which otherwise isn't?

I think you're thinking I'm describing a single level namespace; I'm
referring to something like this:

{group1: {anykey: export_or_mapping}}

"anykey" is not validated by the spec, only by registered validators
for "group1".  Of course it has to have some meaning that is
interpretable by consumers of group1.  The point is that the *spec*
only defines the syntax of group names and export strings, and it's
left to specific groups to define the syntax/semantics of the keys.



> On the values: an export specifier is just a more human-friendly version of
> a dict with module/content/extra keys. While of course the use of
> importables in this area is well established, what specific use cases are we
> enabling by allowing arbitrary JSON? It certainly would clutter the metadata
> and render it less human-readable, and the only thing it provides is a dict
> which could be expressed in an importable form

I gave one example already: i18n/l10n information that's about files
contained in the distribution's data.  It's quite possible to have
distributions without any code, only data of that kind.  A requirement
to create code, just to specify the data seems rather pointless.  In
Nick's reply, he's listed other use cases.

The main question is, should exports and extensions be treated
separately?  Nick originally proposed merging the concepts and using
arbitrary JSON.  My counterproposal was to say, let's distinguish
exports and extensions by restricting the spec to something which
spells out the distinction.

After this further discussion, I think that the use cases we're
discussing really boil back down to exports vs. metadata extensions,
and that maybe we should stick to them being separate.


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-16 Thread Nick Coghlan
Concrete extension use cases I have in mind that don't fit in the
exports/entry-point data model:

- the mapping of prebuilt executable names to wheel contents
- platform specific external dependencies and other hints for conversion to
platform specific formats (e.g. Windows GUIDs)
- metadata for build purposes (e.g. the working directory to use when
building from a VCS checkout, so you can have multiple projects in the same
repo)
- project specific metadata (e.g. who actually did the release)
- security metadata (e.g. security contact address and email GPG
fingerprint)

This is why extensions/exports were originally separate, and may still
remain that way.

Cheers,
Nick.


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-16 Thread Vinay Sajip
PJ Eby writes:

> I guess I didn't explain it very well, because that's roughly what I
> meant: a single namespace for all "extensions", structured as a
> mapping from group names to submappings whose keys can be arbitrary
> values, but whose values must then be either a string or a JSON
> object, and if it's a string, then it should be an export specifier.

Why should the keys be completely arbitrary? I can't see a reason for this;
the current constraint of the "prefixed name" seems sufficient. What would
relaxing this constraint make possible which otherwise isn't?

On the values: an export specifier is just a more human-friendly version of
a dict with module/content/extra keys. While of course the use of
importables in this area is well established, what specific use cases are we
enabling by allowing arbitrary JSON? It certainly would clutter the metadata
and render it less human-readable, and the only thing it provides is a dict
which could be expressed in an importable form
"mypackage.mymodule:path.to.a.dict". Of course this incurs an import penalty
for access to the data, but if the publisher is concerned about this, they
can minimise that overhead by arranging their package hierarchy
appropriately. What kinds of use cases would require that the data be fully
expressed in the metadata to avoid import overhead?
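
As a rough sketch of the equivalence discussed above, an export specifier
can be split into a small dict and then resolved by importing. The grammar
assumed here ("module:dotted.attribute [extra1,extra2]") and the dict keys
("module", "attrs", "extras") are assumptions made for illustration, as are
the function names; they are not taken from the PEP text.

```python
import importlib
import re
from functools import reduce

# Assumed grammar: dotted module, optional ":dotted.attribute",
# optional "[extra1,extra2]" suffix.
_SPEC = re.compile(
    r"(?P<module>[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)"
    r"(?::(?P<attrs>[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*))?"
    r"(?:\s*\[(?P<extras>[^\]]*)\])?"
)


def parse_export(spec):
    """Split an export specifier string into its dict form (no import)."""
    match = _SPEC.fullmatch(spec.strip())
    if match is None:
        raise ValueError("not an export specifier: %r" % (spec,))
    extras = [e.strip() for e in (match.group("extras") or "").split(",")]
    return {
        "module": match.group("module"),
        "attrs": match.group("attrs"),
        "extras": [e for e in extras if e],
    }


def resolve_export(spec):
    """Import the module and walk the attribute path to the exported object."""
    parts = parse_export(spec)
    obj = importlib.import_module(parts["module"])  # import penalty paid here
    if parts["attrs"]:
        obj = reduce(getattr, parts["attrs"].split("."), obj)
    return obj
```

Parsing touches only the string, so scanning metadata stays cheap; the
import cost is incurred only when resolve_export() is actually called.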

> To put it another way, I'm saying something slightly stronger than a
> recommended convention: making it a requirement that strings at that
> level be import specifiers, and only allowing mappings as an
> alternative.

I'd like to understand the use cases which allowing mappings here would
facilitate, that need to avoid importing to access the mapping.

> If you already know what keys go in an entry point group, there's a
> good chance you're doing it wrong.  Normally, the whole point of the
> group is that the keys are defined by the publisher, not the consumer.
>  The normal pattern is that the consumer names the group (representing
> a hook), and the publishers name the extensions (representing
> implementations for the hook).

But the general form of the keys needs to be agreed to some extent between
the consumer and publisher. Otherwise, the consumer doesn't know how to
interpret the values of those keys. Of course a consumer can get all of the
key/value entries exported by a dist or all dists for a specific group, but
then what do they do with it if they don't know what the individual entries
mean?
 
> which means I'd rather not see the PEP make up its own data structures
> when they're not actually needed.

+1 - identification of the needs should come before specific proposals to
address them.

> Don't get me wrong, I'm okay with allowing JSON structures for
> extensions in place of export strings, but I don't think there's been
> a single use case proposed as yet that actually *works better* as a
> data structure.

The forms are equivalent, modulo an import penalty which only occurs for
actual use and not for just scanning to see what's available.

> Way to do it is something like a setuptools entry point -- i.e. a
> basic key-value pair in a consumer-defined namespace, mapping a
> semantically-valued name to an importable object.
> 
> And *most* use cases for extensions, that I'm aware of, fit that bill.
>  You have to be doing something pretty complex to need anything more
> complicated, *and* there has to be a possibility that you're going to
> avoid importing the related code or putting it on sys.path, or else
> you don't actually *save* anything by putting it in the metadata.
> IOW, if you're going to have to import it anyway, there is no point to
> putting it in the metadata; you might as well import it.

Agreed, so I would say that we need to identify these use cases before
saying that arbitrary mappings should be allowed as values.

Regards,

Vinay Sajip



Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-15 Thread PJ Eby
On Thu, Aug 15, 2013 at 7:16 PM, Nick Coghlan  wrote:
> But if we're only going to validate it via hooks, why not have the "mapping
> of names to export specifiers" just be a recommended convention for
> extensions rather than a separate exports field?

I guess I didn't explain it very well, because that's roughly what I
meant: a single namespace for all "extensions", structured as a
mapping from group names to submappings whose keys can be arbitrary
values, but whose values must then be either a string or a JSON
object, and if it's a string, then it should be an export specifier.

To put it another way, I'm saying something slightly stronger than a
recommended convention: making it a requirement that strings at that
level be import specifiers, and only allowing mappings as an
alternative.  In that way, there is a minimum level of validation
possible for the majority of extensions *by default*, without needing
an explicit validator declared.

To put it another way, it ensures that there's a kind of lingua franca
or lowest common denominator that lets somebody understand what's
going on in most extensions, without having to understand a new
*structural* schema for every extension group.  (Just a *syntactical*
one)
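
A minimal sketch of what such default validation might look like, assuming
the {group: {key: export_or_mapping}} shape described in this thread. The
function name, the regular expression for export strings and the optional
per-group key validators are all assumptions for illustration.

```python
import re

# Assumed syntax for an export string: "dotted.module" or
# "dotted.module:dotted.attribute".
_EXPORT = re.compile(
    r"[A-Za-z_]\w*(\.[A-Za-z_]\w*)*(:[A-Za-z_]\w*(\.[A-Za-z_]\w*)*)?"
)


def default_validate(extensions, group_validators=None):
    """Return a list of problems found in {group: {key: value}} metadata.

    Strings must look like export specifiers and anything else must be a
    mapping. Keys are checked only if a validator is registered for the
    group (a callable returning True for acceptable keys).
    """
    group_validators = group_validators or {}
    problems = []
    for group, entries in extensions.items():
        check_key = group_validators.get(group)
        for key, value in entries.items():
            if isinstance(value, str):
                if not _EXPORT.fullmatch(value):
                    problems.append("%s[%s]: not an export specifier" % (group, key))
            elif not isinstance(value, dict):
                problems.append("%s[%s]: expected a string or a mapping" % (group, key))
            if check_key is not None and not check_key(key):
                problems.append("%s[%s]: key rejected by group validator" % (group, key))
    return problems
```

With no validators registered, the check is purely syntactic, which is the
"minimum level of validation by default" idea; group-specific key rules
only apply when someone supplies them.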


> As an extension, pydist.extension_hooks would also be non-conventional,
> since it would define a new namespace, where extension names map to an
> export group of hooks. A separate export group per hook would be utterly
> unreadable.

If you already know what keys go in an entry point group, there's a
good chance you're doing it wrong.  Normally, the whole point of the
group is that the keys are defined by the publisher, not the consumer.
 The normal pattern is that the consumer names the group (representing
a hook), and the publishers name the extensions (representing
implementations for the hook).

I don't see how it makes it unreadable, but then I think in terms of
the ini syntax or setup.py syntax for defining entry points, which is
all very flat.  IMO, creating a second-level data structure for this
doesn't make a whole lot of sense, because now you're nesting
something.

I'm not even clear why you need separate registrations for the
different hooks anyway; ISTM a single hook with an event parameter is
sufficient.  Even if it weren't, I'd be inclined to just make the
information part of the key in that case, e.g.

[pydist.extension_listeners]
preinstall:foo.bar = some.module:hook

This sort of thing is very flat and easy to express in a simple
configuration syntax, which we really shouldn't lose sight of.  It's
just as easy to write a syntax validator as a structure
validator, but if you start with structures then you have to
back-figure a syntax.  I'd very much like it to be easy to define a
simple flat syntax that's usable for 90%+ of extension use cases...
which means I'd rather not see the PEP make up its own data structures
when they're not actually needed.
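
For illustration, the flat ini form shown above can be read with the
standard library's configparser. One caveat in this sketch: configparser
treats both "=" and ":" as key/value delimiters by default, so a key such
as "preinstall:foo.bar" needs the delimiter set restricted to "=". The
function names are hypothetical, and splitting the key into an event and a
name is just one possible reading of the example.

```python
import configparser

INI_TEXT = """
[pydist.extension_listeners]
preinstall:foo.bar = some.module:hook
"""


def load_flat_exports(text):
    """Parse ini-style text into {group: {key: export_string}}."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep keys case-sensitive
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


def split_listener_key(key):
    """Split an 'event:name' key into its two parts."""
    event, _, name = key.partition(":")
    return event, name


groups = load_flat_exports(INI_TEXT)
```

Here groups maps "pydist.extension_listeners" to a one-entry dict whose
key is "preinstall:foo.bar" and whose value is "some.module:hook".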

Don't get me wrong, I'm okay with allowing JSON structures for
extensions in place of export strings, but I don't think there's been
a single use case proposed as yet that actually *works better* as a
data structure.

If you need to do something like have a bunch of i18n/l10n resource
definitions with locales and subpaths and stuff like that...  awesome.
 That's something that might make a lot of sense for JSON.  But when
the ultimate point of the data structure is to define an importable
entry point, and the information needed to identify it can be put into
a relatively short human readable string, ISTM that the One Obvious
Way to do it is something like a setuptools entry point -- i.e. a
basic key-value pair in a consumer-defined namespace, mapping a
semantically-valued name to an importable object.

And *most* use cases for extensions, that I'm aware of, fit that bill.
 You have to be doing something pretty complex to need anything more
complicated, *and* there has to be a possibility that you're going to
avoid importing the related code or putting it on sys.path, or else
you don't actually *save* anything by putting it in the metadata.

IOW, if you're going to have to import it anyway, there is no point to
putting it in the metadata; you might as well import it.  The only
things that make sense to put in metadata for these things are data
that tells you whether or not you need to import it.  Generally, this
means keys, not values, in other words.  (Which is why l10n and
scripts make sense to not be entry points: at the time you use them,
you're not importing 'em.)
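
The "keys, not values" point can be sketched as a small lazy lookup: the
consumer scans names from metadata without importing anything, and imports
only the entry it decides to use. The class name and the example group
contents (stdlib modules standing in for plugins) are hypothetical.

```python
import importlib


class LazyGroup:
    """Entries of one export group; nothing is imported until load()."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def names(self):
        # Scanning reads metadata keys only, so it triggers no imports.
        return sorted(self._entries)

    def load(self, name):
        # The import cost is paid here, for the chosen entry only.
        module_name, _, attrs = self._entries[name].partition(":")
        obj = importlib.import_module(module_name)
        for part in filter(None, attrs.split(".")):
            obj = getattr(obj, part)
        return obj


# Stand-in group: stdlib callables play the role of published exports.
serializers = LazyGroup({"json": "json:dumps", "pickle": "pickle:dumps"})
```

A consumer would typically call names() to decide what is available and
then load() exactly one entry.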


> That's why I'm still inclined to make this one a separate top
> level field: *installers* have to know how to bootstrap the hook system, and
> I like the symmetry of separate, relatively flat, publication and
> subscription interfaces.

I don't really see the value of a separate top-level field, but then
that's because I don't see anything at all special about these hooks
that demands something more sophisticated than common entry points.
AFAICT it's a YAGNI

Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-15 Thread Nick Coghlan
On 15 Aug 2013 12:27, "PJ Eby"  wrote:
>
> On Thu, Aug 15, 2013 at 12:36 PM, Vinay Sajip wrote:
> > PJ Eby writes:
> >> than nested.)  So I would suggest that an export can either be an
> >> import identifier string, *or* a JSON object with arbitrary contents.
> > [snip]
> >> Given how many use cases are already met today by providing
> >> import-based exports, ISTM that they are the 20% that provides 80% of
> >> the value; arbitrary JSON is the 80% that only provides 20%, and so
> >> should not be the entry point (no pun intended) for people dealing
> >> with extensions.
> >
> > The above two statements seem to be contradictory as to the value of
> > arbitrary JSON.
>
> I don't see a contradiction.  I said that the majority of use cases
> (the figurative 80% of value) can be met with just a string (20% of
> complexity), and that a minority of use cases (20% of value) would be
> met by JSON (80% of complexity).
>
> This is consistent with STASCTAP, i.e., simple things are simple,
> complex things are possible.
>
> To be clear: I am *against* arbitrary JSON as the core protocol; it
> should be only for "complex things are possible" and only used when
> absolutely required.  I think we are in agreement on this.

But if we're only going to validate it via hooks, why not have the "mapping
of names to export specifiers" just be a recommended convention for
extensions rather than a separate exports field?

pydist.install_hooks, pydist.console_scripts, pydist.gui_scripts would then
all be conventional export groups.

pydist.prebuilt_commands would be non-conventional, since the values would
be relative file paths rather than export specifiers.

As an extension, pydist.extension_hooks would also be non-conventional,
since it would define a new namespace, where extension names map to an
export group of hooks. A separate export group per hook would be utterly
unreadable. That's why I'm still inclined to make this one a separate top
level field: *installers* have to know how to bootstrap the hook system,
and I like the symmetry of separate, relatively flat, publication and
subscription interfaces.

Cheers,
Nick.

>
>
> > I think the metadata format is a communication tool between
> > developers as much as anything else (though intended to be primarily
> > consumed by software), so I think KISS and YAGNI should be our watch-words
> > (in terms of what the PEP allows), until specific uses have been identified.
>
> +100.
>
>
> >> That would make it easier, I think, to implement both a full-featured
> >> replacement for setuptools entry point API, and allow simple
> >
> > What do you feel is missing in terms of functionality?
>
> What I was saying is that starting from a base of arbitrary JSON (as
> Nick seemed to be proposing) would make it *harder* to provide the
> simple functionality.  Not that adding JSON is needed to support
> setuptools functionality.  Setuptools does just fine with plain export
> strings!
>
> I don't want to lose that simplicity; the "export string or JSON"
> suggestion was a compromise counterproposal to Nick's "let's just use
> arbitrary JSON structures".
>
>
> > I think the thing here is to identify what the components in the build
> > system would be (as an abstraction), how they would interact etc. If we look
> > at how the build side of distutils works, it's all pretty much hardcoded
> > once you specify the inputs, without doing a lot of work to subclass,
> > monkey-patch etc. all over the place. It's unnecessarily hard to do even
> > simple stuff like "use this set of compilation flags for only this specific
> > set of sources in my extension". In any realistic build pipeline you'd need
> > to be able to insert components into the pipeline, sometimes to augment the
> > work of other components, sometimes to replace it etc. and ISTM we don't
> > really know how any of that would work (at a meta level, I mean).
>
> I was assuming that we leave build tools to build tool developers.  If
> somebody wants to create a pipelined or meta-tool system, then
> projects that want to use that can just say, "I use the foobar
> metabuild system".  For installer-tool purposes, it suffices to say
> what system will be responsible, and have a standard for how to invoke
> build systems and get wheels or the raw materials from which the wheel
> should be created.
>
> *How* this build system gets the raw materials and does the build is
> its own business.  It might use extensions, or it might be setup.py
> based, or Makefile based, or who knows whatever else.  That's none of
> the metadata PEP's business, really.  Just how to invoke the builder
> and get the outputs.


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-15 Thread PJ Eby
On Thu, Aug 15, 2013 at 12:36 PM, Vinay Sajip  wrote:
> PJ Eby writes:
>> than nested.)  So I would suggest that an export can either be an
>> import identifier string, *or* a JSON object with arbitrary contents.
> [snip]
>> Given how many use cases are already met today by providing
>> import-based exports, ISTM that they are the 20% that provides 80% of
>> the value; arbitrary JSON is the 80% that only provides 20%, and so
>> should not be the entry point (no pun intended) for people dealing
>> with extensions.
>
> The above two statements seem to be contradictory as to the value of
> arbitrary JSON.

I don't see a contradiction.  I said that the majority of use cases
(the figurative 80% of value) can be met with just a string (20% of
complexity), and that a minority of use cases (20% of value) would be
met by JSON (80% of complexity).

This is consistent with STASCTAP, i.e., simple things are simple,
complex things are possible.

To be clear: I am *against* arbitrary JSON as the core protocol; it
should be only for "complex things are possible" and only used when
absolutely required.  I think we are in agreement on this.


> I think the metadata format is a communication tool between
> developers as much as anything else (though intended to be primarily
> consumed by software), so I think KISS and YAGNI should be our watch-words
> (in terms of what the PEP allows), until specific uses have been identified.

+100.


>> That would make it easier, I think, to implement both a full-featured
>> replacement for setuptools entry point API, and allow simple
>
> What do you feel is missing in terms of functionality?

What I was saying is that starting from a base of arbitrary JSON (as
Nick seemed to be proposing) would make it *harder* to provide the
simple functionality.  Not that adding JSON is needed to support
setuptools functionality.  Setuptools does just fine with plain export
strings!

I don't want to lose that simplicity; the "export string or JSON"
suggestion was a compromise counterproposal to Nick's "let's just use
arbitrary JSON structures".


> I think the thing here is to identify what the components in the build
> system would be (as an abstraction), how they would interact etc. If we look
> at how the build side of distutils works, it's all pretty much hardcoded
> once you specify the inputs, without doing a lot of work to subclass,
> monkey-patch etc. all over the place. It's unnecessarily hard to do even
> simple stuff like "use this set of compilation flags for only this specific
> set of sources in my extension". In any realistic build pipeline you'd need
> to be able to insert components into the pipeline, sometimes to augment the
> work of other components, sometimes to replace it etc. and ISTM we don't
> really know how any of that would work (at a meta level, I mean).

I was assuming that we leave build tools to build tool developers.  If
somebody wants to create a pipelined or meta-tool system, then
projects that want to use that can just say, "I use the foobar
metabuild system".  For installer-tool purposes, it suffices to say
what system will be responsible, and have a standard for how to invoke
build systems and get wheels or the raw materials from which the wheel
should be created.

*How* this build system gets the raw materials and does the build is
its own business.  It might use extensions, or it might be setup.py
based, or Makefile based, or who knows whatever else.  That's none of
the metadata PEP's business, really.  Just how to invoke the builder
and get the outputs.


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-15 Thread Vinay Sajip
PJ Eby writes:

> I think that as part of the spec, we should either reserve multiple
> prefixes for Python/stdlib use, or have a single, always-reserved
> top-level prefix like 'py.' that can be subdivided in the future.

+1

There's quite a lot of stuff in your post that I haven't digested yet, but 
one thing confused me early on:

> than nested.)  So I would suggest that an export can either be an
> import identifier string, *or* a JSON object with arbitrary contents.
[snip]
> Given how many use cases are already met today by providing
> import-based exports, ISTM that they are the 20% that provides 80% of
> the value; arbitrary JSON is the 80% that only provides 20%, and so
> should not be the entry point (no pun intended) for people dealing
> with extensions.

The above two statements seem to be contradictory as to the value of 
arbitrary JSON. I think the metadata format is a communication tool between 
developers as much as anything else (though intended to be primarily 
consumed by software), so I think KISS and YAGNI should be our watch-words 
(in terms of what the PEP allows), until specific uses have been identified. 

> That would make it easier, I think, to implement both a full-featured
> replacement for setuptools entry point API, and allow simple

What do you feel is missing in terms of functionality?

> It's just extensions, IMO.  What else *is* there?  You *could* define
> a core metadata field that says, "this is the distribution I depend on

I think the thing here is to identify what the components in the build 
system would be (as an abstraction), how they would interact etc. If we look 
at how the build side of distutils works, it's all pretty much hardcoded 
once you specify the inputs, without doing a lot of work to subclass, 
monkey-patch etc. all over the place. It's unnecessarily hard to do even 
simple stuff like "use this set of compilation flags for only this specific 
set of sources in my extension". In any realistic build pipeline you'd need 
to be able to insert components into the pipeline, sometimes to augment the 
work of other components, sometimes to replace it etc. and ISTM we don't 
really know how any of that would work (at a meta level, I mean).

Regards,

Vinay Sajip



Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-15 Thread Vinay Sajip
Nick Coghlan writes:

> Sounds fair - let's use "pydist", since we want these definitions to be
> somewhat independent of their reference implementation in distlib :)

Seems reasonable.

> Based on PJE's feedback, I'm also starting to think that the
> exports/extensions split is artificial and we should drop it. Instead,
> there should be a "validate" export hook that build tools can call to
> check for export validity, and the contents of an export group be
> permitted to be arbitrary JSON.

I don't know that we should allow arbitrary JSON here: I would wait to see 
what it is we need, and keep it restricted for now until the more detailed 
understanding of those needs becomes more apparent. Arbitrary JSON is likely 
to be needed for *implementations* of things, but not necessarily for 
*interfaces* between things. The PEP 426 scope should be mainly focused on 
dependency resolution, other installer requirements and interactions between 
installed distributions (exports).

> The installers are still going to have to be export_hooks aware, though,
> since the registered handlers are how the whole export system will be
> bootstrapped. 

Distil currently supports the preuninstall/postinstall hooks, and I expect 
to extend this to other types of hook.

> Something else I'm wondering: should the metabuild system be separate,

I think it should be separate, though of course there will be a role for 
exports. The JSON metadata needed for source packaging and building can be 
quite large (example at [1]), and IMO doesn't really belong with the PEP 426 
metadata. Currently, the extended metadata used by distil for building 
contains the whole PEP 426 metadata as an "index-metadata" sub-dictionary. 
It's already a fairly generic build system - though simple, it can build 
e.g. C/C++/Fortran extensions, handle Cython, SWIG and so on, without using 
any of distutils. However, there's still lots of work to be done to 
generalise the interfaces between different parts of the system so that 
building can be plug and play - it's a bit opaque at the moment, but I 
expect that will improve.

Regards,

Vinay Sajip

[1] http://red-dove.com/pypi/projects/A/Assimulo/package-2.2.json



Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-15 Thread PJ Eby
On Thu, Aug 15, 2013 at 9:21 AM, Nick Coghlan  wrote:
>
> On 15 Aug 2013 00:39, "Vinay Sajip"  wrote:
>>
>> PJ Eby writes:
>>
>> > The build system *should* reserve at least one (subdivisible)
>> > namespace for itself, and use that mechanism for its own extension,
>>
>> +1 - dog-food :-)
>
> Sounds fair - let's use "pydist", since we want these definitions to be
> somewhat independent of their reference implementation in distlib :)

I think that as part of the spec, we should either reserve multiple
prefixes for Python/stdlib use, or have a single, always-reserved
top-level prefix like 'py.' that can be subdivided in the future.
Extensions are a honking great idea, so the stdlib will probably do
more of them in the future.  Likewise, future standards and
informational PEPs will likely document specific extension protocols
of general and specialized interest.  (Notice, for example, that
extensions could be used to publicize what database drivers are
installed and available on a system.)


> Based on PJE's feedback, I'm also starting to think that the
> exports/extensions split is artificial and we should drop it. Instead, there
> should be a "validate" export hook that build tools can call to check for
> export validity, and the contents of an export group be permitted to be
> arbitrary JSON.

I think there is still something to be said for STASCTAP: simple
things are simple, complex things are possible.  (Also, flat is better
than nested.)  So I would suggest that an export can either be an
import identifier string, *or* a JSON object with arbitrary contents.

That would make it easier, I think, to implement both a full-featured
replacement for setuptools entry point API, and allow simple
extensions to be simple.  It means, too, that simple exports can be
defined with a flatter syntax (a la setuptools' ini format) in tools
that generate the JSON.

Given how many use cases are already met today by providing
import-based exports, ISTM that they are the 20% that provides 80% of
the value; arbitrary JSON is the 80% that only provides 20%, and so
should not be the entry point (no pun intended) for people dealing
with extensions.

Removing the extension/export split also raises a somewhat different
question, which is what to *call* them.  I'm sort of leaning towards
"extensions" as the general category, with "exports" being extensions
that consist of an importable object, and "JSON extensions" for ones
that are a JSON mapping object.

So the terminology would be:

Extension group - package-like names, subdivisible as a namespace,
should have a prefix associated with a project that defines the
semantics of the extension group; analogous to Eclipse's notion of an
"extension point"

Extension name - arbitrary string, unique per distribution for a given
group, but not required to be globally unique even for the group.
Specific names or specific syntax for names may be specified by the
creators of the group, and may optionally be validated.

Extension object - either an "export string" specifying an importable
object, or a JSON object.  If a string, must be syntactically valid as
an export; it is not, however, required to reference a module in the
distribution that exports it; it *should* be in that distribution or
one of its dependencies, however.

So, an extension is machine-usable metadata published by a
distribution in order to be (optionally) consumed by other
distributions.  It can be either static JSON metadata, or an
importable object.  The semantics of an extension are defined by its
group, and other extensions can be used to validate those semantics.
Any project that wants to be able to use plugins or extensions of some
kind, can define its own groups, and publish extensions for validating
them.  Python itself will reserve and define a group namespace for
extending the build and installation system, including a sub-namespace
where the validators can be declared.
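
To make the terminology above concrete, here is a minimal sketch of how a tool might tell the two kinds of extension object apart. Nothing here comes from a spec: the function names are hypothetical, and the export-string grammar ("dotted.module:dotted.attr") is only an assumption about what would eventually be specified.

```python
import re

# Assumed grammar for an export string: "dotted.module:dotted.attr".
# The real syntax would be whatever the spec ends up defining.
_EXPORT_RE = re.compile(
    r"^[A-Za-z_]\w*(\.[A-Za-z_]\w*)*:[A-Za-z_]\w*(\.[A-Za-z_]\w*)*$"
)


def classify_extension(obj):
    """Return 'export' for an export string, 'json' for a JSON mapping."""
    if isinstance(obj, str):
        if not _EXPORT_RE.match(obj):
            raise ValueError("not a valid export string: %r" % (obj,))
        return "export"
    if isinstance(obj, dict):
        return "json"
    raise TypeError("extension object must be a string or a JSON object")


def iter_extensions(metadata):
    """Yield (group, name, kind, obj) from {group: {name: obj}} metadata."""
    for group, entries in metadata.items():
        for name, obj in entries.items():
            yield group, name, classify_extension(obj), obj
```

Only the syntax of the export string is checked; whether a name makes sense for its group is left to whoever defines the group.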


> So we would have "pydist.commands" and "pydist.export_hooks" as export
> groups, with "distlib" used as an example of how to define handlers for
> them.

Is 'commands' for scripts, or something else?   Following "flat is
better than nested", I would suggest not using arbitrary JSON for
these when it's easy to define new dotted groups.  (Keeping to such a
style will make it easier for humans to define this stuff in the first
place, before it's turned into JSON.)

(Note, btw, that having more dots in a name does not necessarily equal
"nested", whereas replacing those dots with nested JSON structures
most definitely *is* "nested"!)

Similarly, I'd just as soon see e.g. pydist.hooks.* subgroups, rather
than a dedicated data structure.  A 'pydist.validators' group would of
course also be needed for syntax validation, with extension names in
that group possibly allowing trailing '*' or '**' wildcards.

(There will of course need to be a validation API, which is why I
think that a separate PEP for the "extensions" system is probably

Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-15 Thread Nick Coghlan
On 15 Aug 2013 00:39, "Vinay Sajip"  wrote:
>
> PJ Eby  telecommunity.com> writes:
>
> > The build system *should* reserve at least one (subdivisible)
> > namespace for itself, and use that mechanism for its own extension,
>
> +1 - dog-food :-)

Sounds fair - let's use "pydist", since we want these definitions to be
somewhat independent of their reference implementation in distlib :)

Based on PJE's feedback, I'm also starting to think that the
exports/extensions split is artificial and we should drop it. Instead,
there should be a "validate" export hook that build tools can call to check
for export validity, and the contents of an export group be permitted to be
arbitrary JSON.

So we would have "pydist.commands" and "pydist.export_hooks" as export
groups, with "distlib" used as an example of how to define handlers for
them.

The installers are still going to have to be export_hooks aware, though,
since the registered handlers are how the whole export system will be
bootstrapped.

Something else I'm wondering: should the metabuild system be separate, or
is it just some more export hooks and you define the appropriate export
group to say which build system to invoke? And rather than each installer
having to define their own fallback, we'd just implement the appropriate
hooks in setuptools to call setup.py. (Installers would still need an
explicit fallback for legacy metadata).

Cheers,
Nick.

>
> Regards,
>
> Vinay Sajip
>


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-14 Thread Vinay Sajip
PJ Eby  telecommunity.com> writes:

> The build system *should* reserve at least one (subdivisible)
> namespace for itself, and use that mechanism for its own extension,

+1 - dog-food :-)

Regards,

Vinay Sajip



Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-14 Thread PJ Eby
On Wed, Aug 14, 2013 at 3:14 PM, Nick Coghlan  wrote:
> On 14 August 2013 14:00, PJ Eby  wrote:
>> On Wed, Aug 14, 2013 at 11:36 AM, Nick Coghlan  wrote:
>>> * group - name of the export group to hook
>>> * preupdate - export to call prior to installing/updating/removing a
>>> distribution that exports this export group
>>> * postupdate - export to call after installing/updating/removing a
>>> distribution that exports this export group
>>> * refresh - export to call to resynchronise any caches with the system
>>> state. This will be invoked for every distribution on the system that
>>> exports this export group any time the distribution defining the
>>> export hook is itself installed or upgraded
>>
>> I've reread your post a few times and I'm not sure I understand it.
>> Let me try and spell out a scenario to see if I've got it:
>>
>> * Distribution A defines a refresh hook for group 'foo.bar' -- but
>> doesn't export anything in that group
>> * Distribution B defines an *export* (fka "entry point") -- any export
>> -- in export group 'foo.bar' -- but doesn't define any hooks
>> * Distribution A's refresh hook will be notified when B is installed,
>> updated, or removed
>
> No, A's preupdate and postupdate hooks would fire when B (or any other
> distro exporting the "foo.bar" group) is installed/updated/removed.
> refresh() would fire only when A was installed or updated.

Huh?  So refresh is only relevant to the package itself?  I guess I
don't understand the point of that, since you get the same info from
postupdate then, no?


> I realised that my proposed signature for the refresh() hook is wrong,
> though, since it doesn't deal with catching up on *removed*
> distributions properly. Rather than being called multiple times,
> refresh() instead needs to be called with an iterable providing the
> metadata for all currently installed distributions that export that
> group.

Ah.  But then why is it called for A, instead of..  oh, I think I see
now.  Gotcha.

This is the sort of thing that examples are really made for, so you
can see the use cases for the different hooks.

>> If so, my confusion is probably because of overloading of the term
>> "export" in this context; among other things, it's unclear whether
>> this is a separate data structure from exports themselves...  and if
>> so, why?
>
> Where "exports" is about publishing entries into an export group, the
> new "export_hooks" field would be about *subscribing* to an export
> group and being told about changes to it.

That's not actually a justification for not using exports.

> While you could use a naming convention to define these hooks
> directly in "exports" without colliding with the export of the group
> itself, I think it's better to separate them out so you can do
> stricter validation on the permitted keys and values (the rationale is
> similar to that for separating out commands from more general exports,
> and exports from arbitrary metadata extensions).

The separation of commands is (just barely) justifiable because it's
not a runtime use, it's installer use.

Stricter validation, OTOH, is a completely bogus justification for not
using exports, otherwise nobody would ever have any reason to use
exports, everybody would have to define their own extensions so they
could have stricter validation.  ;-)

The solution to providing more validation is to use *more* export groups, e.g.:

[mebs.export_validators]
mebs.refresh = module.that.validates.keys.in.the.refresh.group:somefunc

(In other words, define hooks for validating export groups, the way
setuptools uses an entry point group for validating setup keywords.)
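
A rough sketch of how a tool could look up such a validator from ini-style export data. The `find_validator` helper and the idea of parsing with `configparser` are illustrative assumptions, not an existing API; the group and export names are the ones from the example above.

```python
import configparser

ENTRY_POINTS = """
[mebs.export_validators]
mebs.refresh = module.that.validates.keys.in.the.refresh.group:somefunc
"""


def find_validator(entry_points_text, group):
    """Return (module, attr) of the validator registered for `group`, or None."""
    # Only "=" separates name and value, so the ":" in the export string
    # is never mistaken for a delimiter.
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep export names case-sensitive
    parser.read_string(entry_points_text)
    section = "mebs.export_validators"
    if parser.has_section(section) and parser.has_option(section, group):
        module, _, attr = parser.get(section, group).partition(":")
        return module, attr
    return None
```

A group with no registered validator simply returns None, which matches the point that validation is optional.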

Of course, even without that possibility, the stricter validation
concept is kind of bogus here: the only thing you can really validate
is that syntactically valid group names are being used as export
names, which isn't much of a validation.  You can't *semantically*
validate them, since there is no global registry of group names.  So
what's the point?

The build system *should* reserve at least one (subdivisible)
namespace for itself, and use that mechanism for its own extension,
for two reasons:

1. Entities should not be multiplied beyond necessity,
2. It serves as an example of how exports are to be used, and
3. The API is reusable...

No, three reasons!  Wait, I'll come in again...  the API is reusable,
it serves as an example, no duplication, and namespaces are a good
idea, let's do more of them...  no, four reasons...  chief amongst the
reasonry...

Seriously: I can *sort of* see a reason to keep commands separate, but
that's a "meh".  I admittedly just grabbed it as a handy way to
shoehorn that functionality into setuptools.

But keeping extensions to the build system itself in a separate place?
No, a thousand times no.  This sort of extensibility is *precisely*
what the darn things are *for*.  If the build system doesn't use them,
what's the point?


> Mostly so you can validate them and display them differently

Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-14 Thread Nick Coghlan
On 14 August 2013 14:00, PJ Eby  wrote:
> On Wed, Aug 14, 2013 at 11:36 AM, Nick Coghlan  wrote:
>> * group - name of the export group to hook
>> * preupdate - export to call prior to installing/updating/removing a
>> distribution that exports this export group
>> * postupdate - export to call after installing/updating/removing a
>> distribution that exports this export group
>> * refresh - export to call to resynchronise any caches with the system
>> state. This will be invoked for every distribution on the system that
>> exports this export group any time the distribution defining the
>> export hook is itself installed or upgraded
>
> I've reread your post a few times and I'm not sure I understand it.
> Let me try and spell out a scenario to see if I've got it:
>
> * Distribution A defines a refresh hook for group 'foo.bar' -- but
> doesn't export anything in that group
> * Distribution B defines an *export* (fka "entry point") -- any export
> -- in export group 'foo.bar' -- but doesn't define any hooks
> * Distribution A's refresh hook will be notified when B is installed,
> updated, or removed

No, A's preupdate and postupdate hooks would fire when B (or any other
distro exporting the "foo.bar" group) is installed/updated/removed.
refresh() would fire only when A was installed or updated.

I realised that my proposed signature for the refresh() hook is wrong,
though, since it doesn't deal with catching up on *removed*
distributions properly. Rather than being called multiple times,
refresh() instead needs to be called with an iterable providing the
metadata for all currently installed distributions that export that
group.

> Is that what this is for?
>
> If so, my confusion is probably because of overloading of the term
> "export" in this context; among other things, it's unclear whether
> this is a separate data structure from exports themselves...  and if
> so, why?

Where "exports" is about publishing entries into an export group, the
new "export_hooks" field would be about *subscribing* to an export
group and being told about changes to it.

While you could use a naming convention to define these hooks
directly in "exports" without colliding with the export of the group
itself, I think it's better to separate them out so you can do
stricter validation on the permitted keys and values (the rationale is
similar to that for separating out commands from more general exports,
and exports from arbitrary metadata extensions).

You're right that the name should be a key in a mapping (like "exports")
rather than a subfield in a list, though. That means I'm envisioning
something like the following:

Distribution "foo":

"export_hooks" : {
    "foo.bar": {
        "preupdate": "foo.bar:exporter_update_started",
        "postupdate": "foo.bar:exporter_update_completed",
        "refresh": "foo.bar:resync_cache"
    }
}

When "foo" is installed or updated, then the installer would invoke
"foo.bar:resync_cache" with an iterable of all currently installed
distributions that export the "foo.bar" export group.

Distribution "notfoo":

"exports" : {
"foo.bar": {}
}

When "notfoo" is installed, updated or removed, then
"foo.bar:exporter_update_started" would be called prior to changing
anything, and "foo.bar:exporter_update_completed" would be called when
the changes had been made.
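
The dispatch described above can be sketched roughly as follows. This is only an illustration under assumptions: the installed-distribution database is modelled as a plain dict keyed by distribution name, and the helper names `hooks_for_change` and `refresh_argument` are made up for the example.

```python
# Hypothetical in-memory view of the installed metadata for "foo" and "notfoo".
INSTALLED = {
    "foo": {
        "export_hooks": {
            "foo.bar": {
                "preupdate": "foo.bar:exporter_update_started",
                "postupdate": "foo.bar:exporter_update_completed",
                "refresh": "foo.bar:resync_cache",
            }
        }
    },
    "notfoo": {"exports": {"foo.bar": {}}},
}


def hooks_for_change(installed, changed_name, kind):
    """List the export strings to call when `changed_name` changes.

    `kind` is "preupdate" or "postupdate".
    """
    changed_groups = set(installed[changed_name].get("exports", {}))
    calls = []
    for _name, meta in sorted(installed.items()):
        for group, hooks in sorted(meta.get("export_hooks", {}).items()):
            if group in changed_groups and kind in hooks:
                calls.append(hooks[kind])
    return calls


def refresh_argument(installed, group):
    """Names of all installed distributions that export `group`."""
    return [name for name, meta in sorted(installed.items())
            if group in meta.get("exports", {})]
```

With this data, changing "notfoo" triggers the hooks registered by "foo", while installing "foo" itself would pass ["notfoo"] to its refresh hook.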

> If I were doing something like this in the existing entry point
> system, I'd do something like:
>
>   [mebs.refresh]
>   foo.bar = my.hook.module:refresh
>
> i.e., just list the hooks in an export group, using the export name to
> designate what export group is being monitored.  This approach
> leverages the fact that exports already need to be indexed, so why
> create a whole new sort of metadata just for the hooks?

Mostly so you can validate them and display them differently, and
avoid reserving any part of the shared namespace. I find documentation
is also easier when the core use cases aren't wedged into the
extension mechanisms (even if they share implementation details under
the hood).

> (But of course if I have misunderstood what you're trying to do in the
> first place, this and my other thoughts may be moot.)

I think you mostly understood it, I'm just not explaining it very well yet.

> (Oh, and btw, if a distribution has hooks for itself, then how are you
> going to invoke two different versions of the code?  Rollback
> sys.modules and reload?  Spawn another process?)

Requiring that all hook invocations happen in a subprocess sounds like
the best plan to me. The arguments all serialise nicely to JSON so
that shouldn't be too hard to arrange.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-14 Thread PJ Eby
On Wed, Aug 14, 2013 at 11:36 AM, Nick Coghlan  wrote:
> * group - name of the export group to hook
> * preupdate - export to call prior to installing/updating/removing a
> distribution that exports this export group
> * postupdate - export to call after installing/updating/removing a
> distribution that exports this export group
> * refresh - export to call to resynchronise any caches with the system
> state. This will be invoked for every distribution on the system that
> exports this export group any time the distribution defining the
> export hook is itself installed or upgraded

I've reread your post a few times and I'm not sure I understand it.
Let me try and spell out a scenario to see if I've got it:

* Distribution A defines a refresh hook for group 'foo.bar' -- but
doesn't export anything in that group
* Distribution B defines an *export* (fka "entry point") -- any export
-- in export group 'foo.bar' -- but doesn't define any hooks
* Distribution A's refresh hook will be notified when B is installed,
updated, or removed

Is that what this is for?

If so, my confusion is probably because of overloading of the term
"export" in this context; among other things, it's unclear whether
this is a separate data structure from exports themselves...  and if
so, why?

If I were doing something like this in the existing entry point
system, I'd do something like:

  [mebs.refresh]
  foo.bar = my.hook.module:refresh

i.e., just list the hooks in an export group, using the export name to
designate what export group is being monitored.  This approach
leverages the fact that exports already need to be indexed, so why
create a whole new sort of metadata just for the hooks?

(But of course if I have misunderstood what you're trying to do in the
first place, this and my other thoughts may be moot.)

(Oh, and btw, if a distribution has hooks for itself, then how are you
going to invoke two different versions of the code?  Rollback
sys.modules and reload?  Spawn another process?)


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-14 Thread Nick Coghlan
On 14 August 2013 11:55, Erik Bray  wrote:
> I think I'm okay with this so long as it remains optional.  I'm not
> crazy about executable build specs where they're not necessary.  For
> most cases, especially in pure Python packages, it's frequently
> overkill and asking for trouble.  So I would still want to see a
> well-accepted static build spec for Python packages too (sort of a la
> setup.cfg as parsed by d2to1, only better), though I realize that's a
> separate issue from PEP 426.

Sure, the main point of PEP 426 is to make it so the packaging
ecosystem doesn't need to *care* about the user-facing formats. YAML,
ini, Python, doesn't matter :)

My current plan is to focus on formalising pydist.json as the main
vehicle for communicating between build tools and installers. I had
previously been thinking we could postpone defining the build system
hooks, but I now think it makes more sense to formalise that as well
before declaring metadata 2.0 ready for general use. In the meantime,
we'll continue getting by with setup.py and the setuptools metadata
formats :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-14 Thread Erik Bray
On Wed, Aug 14, 2013 at 11:36 AM, Nick Coghlan  wrote:
> I spent last weekend at "Flock to Fedora", mostly due to my day job
> working on the Beaker integration testing system
> (http://beaker-project.org) for Red Hat, but also to talk to folks
> about Fedora and Python interactions.
>
> A completely unexpected discovery over the weekend was that some of
> the RPM folks are exploring the idea of switching the *user* facing
> format for the packaging system away from spec files and towards
> directly executable Python code. Thus, you'd get away from the painful
> mess that is RPM conditionals and macros and have a real programming
> language to define what your built packages *should* look like, while
> still *producing* static metadata for consumption by installers and
> other software distribution tools.
>
> Hmm, does that approach sound familiar to anyone? :)
>
> Anyway, we were talking about how they're considering approaching the
> install hook problem, and their approach gave me an idea for a better
> solution in PEP 426.
>
> Currently, PEP 426 allows a distribution to define "install hooks":
> hooks that will execute after the distribution is installed and before
> it is uninstalled.
>
> I'm now planning to change that to allowing distributions to define
> "export hooks", based on the cleaned up notion of "export groups" in
> the latest version of PEP 426. An export hook definition consists of
> the following fields:
>
> * group - name of the export group to hook
> * preupdate - export to call prior to installing/updating/removing a
> distribution that exports this export group
> * postupdate - export to call after installing/updating/removing a
> distribution that exports this export group
> * refresh - export to call to resynchronise any caches with the system
> state. This will be invoked for every distribution on the system that
> exports this export group any time the distribution defining the
> export hook is itself installed or upgraded
>
> If a distribution exports groups that it also defines hooks for, it
> will exhibit the following behaviours:
>
> Fresh install:
> * preupdate NOT called (hook not yet registered)
> * postupdate called
> * refresh called
>
> Upgrade:
> * preupdate called (old version)
> * postupdate called (new version)
> * refresh called (new version)
>
> Complete removal:
> * preupdate called
> * postupdate NOT called (hook no longer registered)
> * refresh NOT called (hook no longer registered)
>
> This behaviour follows naturally from *not* special casing
> self-exports: prior to installation, the export hooks won't be
> registered, so they won't be called, and the same applies following
> complete removal.
>
> The hooks would have the following signatures:
>
> def preupdate(current_meta, next_meta):
>     # current_meta==None indicates fresh install
>     # next_meta==None indicates complete removal
>
> def postupdate(previous_meta, current_meta):
>     # previous_meta==None indicates fresh install
>     # current_meta==None indicates complete removal
>
> def refresh(current_meta):
>     # Used to ensure any caches are consistent with system state
>     # Allows handling of previously installed distributions
>

I think I'm okay with this so long as it remains optional.  I'm not
crazy about executable build specs where they're not necessary.  For
most cases, especially in pure Python packages, it's frequently
overkill and asking for trouble.  So I would still want to see a
well-accepted static build spec for Python packages too (sort of a la
setup.cfg as parsed by d2to1, only better), though I realize that's a
separate issue from PEP 426.

Erik


[Distutils] Changing the "install hooks" mechanism for PEP 426

2013-08-14 Thread Nick Coghlan
I spent last weekend at "Flock to Fedora", mostly due to my day job
working on the Beaker integration testing system
(http://beaker-project.org) for Red Hat, but also to talk to folks
about Fedora and Python interactions.

A completely unexpected discovery over the weekend was that some of
the RPM folks are exploring the idea of switching the *user* facing
format for the packaging system away from spec files and towards
directly executable Python code. Thus, you'd get away from the painful
mess that is RPM conditionals and macros and have a real programming
language to define what your built packages *should* look like, while
still *producing* static metadata for consumption by installers and
other software distribution tools.

Hmm, does that approach sound familiar to anyone? :)

Anyway, we were talking about how they're considering approaching the
install hook problem, and their approach gave me an idea for a better
solution in PEP 426.

Currently, PEP 426 allows a distribution to define "install hooks":
hooks that will execute after the distribution is installed and before
it is uninstalled.

I'm now planning to change that to allowing distributions to define
"export hooks", based on the cleaned up notion of "export groups" in
the latest version of PEP 426. An export hook definition consists of
the following fields:

* group - name of the export group to hook
* preupdate - export to call prior to installing/updating/removing a
distribution that exports this export group
* postupdate - export to call after installing/updating/removing a
distribution that exports this export group
* refresh - export to call to resynchronise any caches with the system
state. This will be invoked for every distribution on the system that
exports this export group any time the distribution defining the
export hook is itself installed or upgraded

If a distribution exports groups that it also defines hooks for, it
will exhibit the following behaviours:

Fresh install:
* preupdate NOT called (hook not yet registered)
* postupdate called
* refresh called

Upgrade:
* preupdate called (old version)
* postupdate called (new version)
* refresh called (new version)

Complete removal:
* preupdate called
* postupdate NOT called (hook no longer registered)
* refresh NOT called (hook no longer registered)

This behaviour follows naturally from *not* special casing
self-exports: prior to installation, the export hooks won't be
registered, so they won't be called, and the same applies following
complete removal.

The hooks would have the following signatures:

def preupdate(current_meta, next_meta):
    # current_meta==None indicates fresh install
    # next_meta==None indicates complete removal

def postupdate(previous_meta, current_meta):
    # previous_meta==None indicates fresh install
    # current_meta==None indicates complete removal

def refresh(current_meta):
    # Used to ensure any caches are consistent with system state
    # Allows handling of previously installed distributions
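
The fresh install / upgrade / removal behaviour listed earlier for a distribution that hooks its own export group can be modelled in a few lines. This is only a sketch: `hooks_called` is a hypothetical helper, and the "old"/"new" labels just record which version's hook would run.

```python
def hooks_called(old_meta, new_meta):
    """Return the self-hooks that fire, as (hook, version) pairs.

    `old_meta` is None for a fresh install; `new_meta` is None for a
    complete removal.
    """
    calls = []
    if old_meta is not None:
        # The old version is installed, so its preupdate hook is registered.
        calls.append(("preupdate", "old"))
    if new_meta is not None:
        # The new version is now installed, so its hooks are registered.
        calls.append(("postupdate", "new"))
        calls.append(("refresh", "new"))
    return calls
```

No special casing of self-exports is needed: a hook runs only if the version that defines it is registered at that moment.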

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia