Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Nick Coghlan
On 20 October 2017 at 07:33, Donald Stufft  wrote:

>
> On Oct 19, 2017, at 5:26 PM, Tres Seaver  wrote:
>
> Having the packaging
> system register those services at installation time (even if it doesn't
> care otherwise about them) seems pretty reasonable to me.
>
>
> It does register them at installation time, using an entirely generic
> feature of “you can add any file you want to a dist-info directory and we
> will preserve it”. It doesn’t need to know anything else about them other
> than it’s a file that needs preserved.
>

That's all the *installer* needs to know. Publishing tools like flit need
to know the internal format in order to replicate the effect of
https://packaging.python.org/tutorials/distributing-packages/#console-scripts
and to interoperate with any other pkg_resources based plugin ecosystem.

I personally find it useful to think of entry points as a pub/sub
communications channel between package authors and other runtime components.

When you use the entry points syntax to declare a pytest plugin as a
publisher, your intended subscriber is pytest, and pytest defines the
possible messages. Ditto for any other entry points based plugin management
system.

Installers are mostly just a relay link in that pub/sub channel - they take
the entry point announcement messages in the sdist or wheel archive, and
pass them along to the installation database.

The one exception to the "installers as passive relay" behaviour is that
when you specify "console_scripts", your intended subscribers *are* package
installation tools, and your message is "I'd like an executable wrapper for
these entry points, please".
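As a rough illustration of what "an executable wrapper for these entry points" involves, here is a minimal sketch of turning a `console_scripts` value into wrapper source. The template text, the `make_wrapper` name and the default interpreter path are assumptions made for this example, not what any particular installer emits, and optional extras (`[extra]`) are deliberately not handled.

```python
import re

# Hypothetical template, loosely modelled on the wrappers installers generate.
WRAPPER_TEMPLATE = """\
#!{python}
import sys
from {module} import {import_name}

if __name__ == "__main__":
    sys.exit({func}())
"""


def make_wrapper(entry_point_value, python="/usr/bin/python3"):
    """Turn a 'package.module:object.attr' reference into wrapper source."""
    match = re.fullmatch(r"\s*([\w.]+)\s*:\s*([\w.]+)\s*", entry_point_value)
    if match is None:
        raise ValueError("not a 'module:function' reference: %r" % entry_point_value)
    module, func = match.groups()
    # Only the first component of a dotted object path can be imported by name.
    import_name = func.split(".")[0]
    return WRAPPER_TEMPLATE.format(
        python=python, module=module, import_name=import_name, func=func
    )


print(make_wrapper("mypkg.cli:main"))
```

The installer only needs the name on the left of the `=` (the script filename) and the object reference on the right; everything else in the wrapper is boilerplate.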

Right now, the only documented publishing API for that pub/sub channel is
setuptools.setup(), and the only documented subscription API is
pkg_resources. Documenting the file format explicitly changes that dynamic,
such that any publisher that produces a compliant `entry_points.txt` file
will be supported by pkg_resources, and any consumer that can read a
compliant `entry_points.txt` file will be supported by setuptools.setup().
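To make the "any consumer that can read a compliant file" side concrete, here is a minimal sketch of a reader built only on the standard library's `configparser`. The sample content, the package names and the `parse_entry_points` helper are made up for illustration, and this is not a substitute for the parsing rules a written specification would pin down.

```python
import configparser

# Illustrative file content; the package and module names are hypothetical.
SAMPLE = """\
[console_scripts]
mytool = mypkg.cli:main

[pytest11]
myplugin = mypkg.pytest_plugin
"""


def parse_entry_points(text):
    """Return {group: {name: object_reference}} from entry_points.txt content."""
    # Use '=' as the only delimiter and keep names case sensitive by
    # disabling configparser's default lower-casing of keys.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


groups = parse_entry_points(SAMPLE)
print(groups["console_scripts"]["mytool"])
```

Each section name is a group that some subscriber (an installer, pytest, and so on) looks for; the file format itself attaches no meaning to the group names.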

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Nick Coghlan
On 20 October 2017 at 06:34, Donald Stufft  wrote:

>
> On Oct 19, 2017, at 4:04 PM, Donald Stufft  wrote:
>
> Like I said, I’m perfectly fine documenting that if you add an
> entry_points.txt to the .dist-info directory, that is an INI file that
> contains a section named “console_scripts” and define what is valid inside
> of the console_scripts section that it will generate script wrappers, then
> fine. But we should leave any other section in this entry_points.txt file
> as undefined in packaging terms, and point people towards setuptools for
> more information about it if they want to know anything more than what we
> need for packaging.
>
>
> To be more specific here, the hypothetical thing we would be
> documenting/standardizing here is console entry points and script wrappers,
> not a generic plugin system. So console scripts would be the focus of the
> documentation.
>

We've already effectively blessed console_scripts as a standard approach:
https://packaging.python.org/tutorials/distributing-packages/#entry-points

The specific problem that blessing creates is that we currently only define:

- a way for publishers to specify console_scripts via setuptools
- a way for installers to find console_scripts using pkg_resources

That's *very* similar to the problem we had with dependency declarations:
only setuptools knew how to write them, and only easy_install knew how to
read them.

Beyond the specific example of console_scripts, there are also multiple
subecosystems where both publishers and subscribers remain locked into the
setuptools/pkg_resources combination because they use entry points for
their plugin management. This means that if you want to write a pytest
plugin, for example, the only officially supported way to do so is to use
setuptools in order to publish the relevant entry point definitions:
https://docs.pytest.org/en/latest/writing_plugins.html#setuptools-entry-points

If we want to enable pytest plugin authors to use other build systems like
flit, then those build systems need a defined interoperability format
that's compatible with what pytest is expecting to see (i.e. entry point
definitions that pkg_resources knows how to read).

We ended up solving the previous tight publisher/installer coupling problem
for dependency management *not* by coming up with completely new metadata
formats, but rather by better specifying the ones that setuptools already
knew how to emit, such that most publishers didn't need to change anything,
and even when there were slight differences between the way setuptools
worked and the agreed interoperability standards, other tools could readily
translate setuptools output into the standardised form (e.g. egg_info ->
PEP 376 dist-info directories and wheel metadata).

The difference in this case is that:

1. entry_points.txt is already transported reliably through the whole
packaging toolchain
2. It is the existing interoperability format for `console_scripts`
definitions
3. Unlike setup.cfg & pyproject.toml, actual humans never touch it - it's
written and read solely by software

This means that the interoperability problems we actually care about
solving (allowing non-setuptools based publishing tools to specify
console_scripts and other pkg_resources entry points, and allowing
non-pkg_resources based consumers to read pkg_resources entry point
metadata, including console_scripts) can both be solved *just* by properly
specifying the existing de facto format.

So standardising on entry_points.txt isn't a matter of "because setuptools
does it", it's because formalising it is the least-effort solution to what
we actually want to enable: making setuptools optional on the publisher
side (even if you need to publish entry point metadata), and making
pkg_resources optional on the consumer side (even if you need to read entry
point metadata).
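For the publisher side, a non-setuptools build tool would only need to emit the same INI-style text. The following is a minimal sketch using the standard library; `render_entry_points` and the package names are hypothetical, and a real build backend would also have to place the file inside the `.dist-info` directory of the wheel it produces.

```python
import configparser
import io


def render_entry_points(groups):
    """Serialise {group: {name: object_reference}} as entry_points.txt text."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry point names case sensitive
    for group, entries in groups.items():
        parser[group] = entries
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


text = render_entry_points({
    "console_scripts": {"mytool": "mypkg.cli:main"},
})
print(text)
```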

I do agree that the metadata caching problem is best tackled as a specific
motivating example for supporting packaging installation and uninstallation
hooks, but standardising the entry points format still helps us with that:
it means we can just define "python.install_hooks" as a new entry point
category, and spend our energy on defining the semantics and APIs of the
hooks themselves, rather than having to worry about defining a new format
for how publishers will declare how to run the hooks, or how installers
will find out which hooks have been installed locally.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Nick Coghlan
On 20 October 2017 at 02:14, Thomas Kluyver  wrote:

> On Thu, Oct 19, 2017, at 04:10 PM, Donald Stufft wrote:
> > I’m in favor, although one question I guess is whether it should be a
> > PEP or an ad hoc spec. Given (2) it should *probably* be a PEP (since
> > without (2), it’s just another file in the .dist-info directory and that
> > doesn’t actually need standardized at all). I don’t think that this will
> > be a very controversial PEP though, and should be pretty easy.
>
> I have opened a PR to document what is already there, without adding any
> new features. I think this is worth doing even if we don't change
> anything, since it's a de-facto standard used for different tools to
> interact.
>
> https://github.com/pypa/python-packaging-user-guide/pull/390
>
> We can still write a PEP for caching if necessary.
>

+1 for that approach (PR for the status quo, PEP for a shared metadata
caching design) from me

Making the status quo more discoverable is valuable in its own right, and
the only decisions we'll need to make for that are terminology
clarification ones, not interoperability ones (this isn't like PEP 440 or
508 where we actually thought some of the default setuptools behaviour was
slightly incorrect and wanted to change it).

Figuring out a robust cross-platform network-file-system-tolerant metadata
caching design on the other hand is going to be hard, and as Donald
suggests, the right ecosystem level solution might be to define
install-time hooks for package installation operations.


> > I’m also in favor of this. Although I would suggest SQLite rather than a
> > JSON file for the primary reason being that a JSON file isn’t
> > multiprocess safe without being careful (and possibly introducing
> > locking) whereas SQLite has already solved that problem.
>
> SQLite was actually my first thought, but from experience in Jupyter &
> IPython I'm wary of it - its built-in locking does not work well over
> NFS, and it's easy to corrupt the database. I think careful use of
> atomic writing can be more reliable (though that has given us some
> problems too).
>
> That may be easier if there's one cache per user, though - we can
> perhaps try to store it somewhere that's not NFS.
>
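The "careful use of atomic writing" mentioned in the quoted text usually means writing to a temporary file and renaming it over the target. The sketch below shows that pattern on a local filesystem; the `write_cache_atomically` name and the JSON layout are assumptions for illustration, and nothing here addresses the NFS behaviour the thread is worried about.

```python
import json
import os
import tempfile


def write_cache_atomically(path, data):
    """Write JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so os.replace()
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers never take a lock in this scheme; concurrent writers simply race, and the last rename wins.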

I'm wondering if rather than jumping straight to a PEP, it may make sense
to instead initially pursue this idea as a *non-*standard, implementation
dependent thing specific to the "entrypoints" project. There are a *lot* of
challenges to be taken into account for a truly universal metadata caching
design, and it would be easy to fall into the trap of coming up with a
design so complex that nobody can realistically implement it.

Specifically, I'm thinking of a usage model along the lines of the
updatedb/locate pair on *nix systems: `locate` gives you access to very
fast searches of your filesystem, but it *doesn't* try to automagically
keep its indexes up to date. Instead, refreshing the indexes is handled by
`updatedb`, and you can either rely on that being run automatically in a
cron job, or else force an update with `sudo updatedb` when you want to use
`locate`.

For a project like entrypoints, what that might look like is that at
*runtime*, you may implement a reasonably fast "cache freshness check",
where you scanned the mtime of all the sys.path entries, and compared those
to the mtime of the cache. If the cache looks up to date, then cool,
otherwise emit a warning about the stale metadata cache, and then bypass it.
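A minimal sketch of that freshness check might look like the following. The function names and the `read_cache`/`scan` callables are hypothetical and are not part of the entrypoints project's API. One known limitation: a directory's mtime typically changes only when entries are added to or removed from it directly, so this cannot notice edits nested deeper inside a sys.path entry.

```python
import os
import sys
import warnings


def cache_is_fresh(cache_path, search_path=None):
    """Return True if the cache is at least as new as every path entry."""
    search_path = sys.path if search_path is None else search_path
    try:
        cache_mtime = os.stat(cache_path).st_mtime
    except OSError:
        return False  # no cache yet
    for entry in search_path:
        try:
            # An empty sys.path entry means the current directory.
            if os.stat(entry or ".").st_mtime > cache_mtime:
                return False
        except OSError:
            continue  # missing sys.path entries are ignored
    return True


def load_entry_points(cache_path, read_cache, scan):
    """Use the cache when fresh; otherwise warn and fall back to scanning."""
    if cache_is_fresh(cache_path):
        return read_cache(cache_path)
    warnings.warn("entry points cache is stale or missing; bypassing it")
    return scan()
```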

The entrypoints project itself could then expose a
`refresh-entrypoints-cache` command that could start out only supporting
virtual environments, and then extend to per-user caching, and then finally
(maybe) consider whether or not it wanted to support installation-wide
caches (with the extra permissions management and cross-process and
cross-system coordination that may imply).

Such an approach would also tie in nicely with Donald's suggestion of
reframing the ecosystem level question as "How should the entrypoints
project request that 'refresh-entrypoints-cache' be run after every package
installation or removal operation?", which in turn would integrate nicely
with things like RPM file triggers (where the system `pip` package could
set a file trigger that arranged for any properly registered Python package
installation plugins to be run for every modification to site-packages
while still appropriately managing the risk of running arbitrary code with
elevated privileges)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Wes Turner
On Thursday, October 19, 2017, Donald Stufft  wrote:

>
> On Oct 19, 2017, at 5:26 PM, Tres Seaver  wrote:
>
> Having the packaging
> system register those services at installation time (even if it doesn't
> care otherwise about them) seems pretty reasonable to me.
>
>
> It does register them at installation time, using an entirely generic
> feature of “you can add any file you want to a dist-info directory and we
> will preserve it”. It doesn’t need to know anything else about them other
> than it’s a file that needs preserved.
>

When I think of 'register at installation time', I think of adding them to
a single { locked JSON || SQLite DB || ...}; because that's the only way
there'd be a performance advantage?

Why would we write a .txt, transform it to {JSON || SQL INSERTS}, and then
write it to a central registrar?

(BTW, pipsi does console script entry points with isolated virtualenvs
linked into from ~/.local/bin (which is generally user-writable)).


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft

> On Oct 19, 2017, at 5:26 PM, Tres Seaver  wrote:
> 
> Having the packaging
> system register those services at installation time (even if it doesn't
> care otherwise about them) seems pretty reasonable to me.

It does register them at installation time, using an entirely generic feature 
of “you can add any file you want to a dist-info directory and we will preserve 
it”. It doesn’t need to know anything else about them other than it’s a file 
that needs preserved.


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Tres Seaver
On 10/19/2017 04:57 PM, Donald Stufft wrote:

> Because the feature is unrelated to packaging other than the fact we
> currently utilize it for console_scripts.

That seems like an odd perspective.  Console scripts may be the only bit of
entry points which is used *by the packaging system* at installation time,
but a system composed of separately-installable packages providing shared
services needs some way of querying those services at runtime, which is
what all the *other* uses of entry points represent.  Having the packaging
system register those services at installation time (even if it doesn't
care otherwise about them) seems pretty reasonable to me.


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"   http://palladion.com



Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft


> On Oct 19, 2017, at 4:36 PM, Thomas Kluyver  wrote:
> 
> On Thu, Oct 19, 2017, at 09:04 PM, Donald Stufft wrote:
>> Like I said, I’m perfectly fine documenting that if you add an
>> entry_points.txt to the .dist-info directory, that is an INI file that
>> contains a section named “console_scripts” and define what is valid
>> inside of the console_scripts section that it will generate script
>> wrappers, then fine. But we should leave any other section in this
>> entry_points.txt file as undefined in packaging terms, and point people
>> towards setuptools for more information about it if they want to know
>> anything more than what we need for packaging.
> 
> I don't see any advantage in describing the file format but then
> pretending that there's only one section in it. We're not prescribing any
> particular meaning or use for other sections, but it seems bizarre to
> not describe the possibilities. console_scripts is just one use case.

Because the feature is unrelated to packaging other than the fact we currently 
utilize it for console_scripts. A spec to standardize console_scripts is a good 
thing, a spec to standardize an almost entirely unrelated feature for packaging 
is a bad thing. 

> 
> Also, entry points in general kind of are a packaging thing. You specify
> them in packaging metadata, both for setuptools and flit, and the
> packaging tools write entry_points.txt. It's not the only way to create
> a plugin system, but it's the way this one was created.

You can describe lots of things in the packaging metadata, because one of the 
features of the packaging metadata is you can add arbitrary files to the 
dist-info directory. Entrypoints are one such file that some projects add to 
that directory, but there are other examples and just because it involves 
adding files to that, does not mean it belongs to “packaging”.

> 
> I honestly don't get the resistance to documenting this as a whole. I'm
> not proposing something that will add a new maintenance burden; it's a
> description of something that's already there. Can't we save the energy
> for discussing a real change or new thing?
> 

I don’t get the resistance to documenting this where it belongs. It’s not any 
more difficult to document things in the setuptools repository than it is to 
document it in the packaging specs repository.



Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Thomas Kluyver
On Thu, Oct 19, 2017, at 09:04 PM, Donald Stufft wrote:
> Like I said, I’m perfectly fine documenting that if you add an
> entry_points.txt to the .dist-info directory, that is an INI file that
> contains a section named “console_scripts” and define what is valid
> inside of the console_scripts section that it will generate script
> wrappers, then fine. But we should leave any other section in this
> entry_points.txt file as undefined in packaging terms, and point people
> towards setuptools for more information about it if they want to know
> anything more than what we need for packaging.

I don't see any advantage in describing the file format but then
pretending that there's only one section in it. We're not prescribing any
particular meaning or use for other sections, but it seems bizarre to
not describe the possibilities. console_scripts is just one use case.

Also, entry points in general kind of are a packaging thing. You specify
them in packaging metadata, both for setuptools and flit, and the
packaging tools write entry_points.txt. It's not the only way to create
a plugin system, but it's the way this one was created.

I honestly don't get the resistance to documenting this as a whole. I'm
not proposing something that will add a new maintenance burden; it's a
description of something that's already there. Can't we save the energy
for discussing a real change or new thing?

Thomas


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft

> On Oct 19, 2017, at 4:04 PM, Donald Stufft  wrote:
> 
> Like I said, I’m perfectly fine documenting that if you add an 
> entry_points.txt to the .dist-info directory, that is an INI file that 
> contains a section named “console_scripts” and define what is valid inside of 
> the console_scripts section that it will generate script wrappers, then fine. 
> But we should leave any other section in this entry_points.txt file as 
> undefined in packaging terms, and point people towards setuptools for more 
> information about it if they want to know anything more than what we need for 
> packaging.

To be more specific here, the hypothetical thing we would be 
documenting/standardizing here is console entry points and script wrappers, not 
a generic plugin system. So console scripts would be the focus of the 
documentation.


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft


> On Oct 19, 2017, at 3:55 PM, Thomas Kluyver  wrote:
> 
> On Thu, Oct 19, 2017, at 08:29 PM, Donald Stufft wrote:
>> Because it is? A generic plugin mechanism is not a packaging feature any 
>> more than a HTTP client is a packaging feature, but setuptools contains one 
>> of those too. Since setuptools was in large part a packaging library, it 
>> will of course contain many packaging features that we’re going to 
>> standardize on, but something being in setuptools does not in fact make it a 
>> packaging feature in and of itself.
> 
> My argument is not that it's in setuptools, it's that
> 
> 1. It's already processed by multiple packaging tools
> 2. Any tool producing wheels which include command line tools basically has 
> to use entry points (or include a bunch of redundant complexity to make 
> command-line wrappers). It's a de-facto part of the wheel spec, at least 
> until a replacement is devised - and since it works, replacing for semantic 
> cleanliness is not a priority.
> 
> You're quite right that a plugin system doesn't need to be a packaging 
> standard. But that ship has sailed. It's already a standard format for 
> packaging, the only question is whether it's documented. Practicality beats 
> purity.


Like I said, I’m perfectly fine documenting that if you add an entry_points.txt 
to the .dist-info directory, that is an INI file that contains a section named 
“console_scripts” and define what is valid inside of the console_scripts 
section that it will generate script wrappers, then fine. But we should leave 
any other section in this entry_points.txt file as undefined in packaging 
terms, and point people towards setuptools for more information about it if 
they want to know anything more than what we need for packaging.

I am against fully speccing out or adding more features to entry points as part 
of a packaging standardization effort.


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Thomas Kluyver
On Thu, Oct 19, 2017, at 08:29 PM, Donald Stufft wrote:
> Because it is? A generic plugin mechanism is not a packaging feature
> any more than a HTTP client is a packaging feature, but setuptools
> contains one of those too. Since setuptools was in large part a
> packaging library, it will of course contain many packaging features
> that we’re going to standardize on, but something being in setuptools
> does not in fact make it a packaging feature in and of itself.

My argument is not that it's in setuptools, it's that

1. It's already processed by multiple packaging tools
2. Any tool producing wheels which include command line tools basically
   has to use entry points (or include a bunch of redundant complexity
   to make command-line wrappers). It's a de-facto part of the wheel
   spec, at least until a replacement is devised - and since it works,
   replacing for semantic cleanliness is not a priority.

You're quite right that a plugin system doesn't need to be a packaging
standard. But that ship has sailed. It's already a standard format for
packaging, the only question is whether it's documented. Practicality
beats purity.

Thomas


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft


> On Oct 19, 2017, at 3:15 PM, Thomas Kluyver  wrote:
> 
> On Thu, Oct 19, 2017, at 08:01 PM, Donald Stufft wrote:
>> 
>>> On Oct 19, 2017, at 2:54 PM, Thomas Kluyver  wrote:
>>> 
>>> I don't think this needs to be controversial. They are a de-facto
>>> packaging standard, whether or not that's theoretically necessary.
>>> There's more than one tool that can create them (setuptools, flit), and
>>> more than one that can consume them (pkg_resources, entrypoints). Lots
>>> of packages use them, and they're not going anywhere soon. Describing
>>> the format properly seems like a clear win.
>> 
>> 
>> 
>> I disagree they are a packaging standard and I think it would be crummy to 
>> define it as one. I believe it is a setuptools feature, that flit and 
>> entrypoints wants to integrate with a setuptools feature is fine, but that 
>> doesn’t make it a packaging standard just because it came from setuptools. I 
>> agree that describing the format properly is a clear win, but I believe it 
>> belongs in the setuptools documentation.
> 
> pip and distlib also independently read this format without going through 
> setuptools. It's a de-facto standard already.  Entry points are also the most 
> common way for packages to install command-line scripts, and the most 
> effective way to do so across different platforms. So it's essential that 
> install tools do understand this.

It’s only essential in that we support a very limited subset specifically for 
console scripts, which long term we should be extracting from entry points and 
using something dedicated to that. Generating script wrappers is a packaging 
concern, and if this proposal was about documenting the console_scripts key in 
an entry_points.txt file to trigger a console script being generated, then 
that’s fine with me.

> 
> Much of our packaging standards were built out of setuptools features anyway 
> - why pretend that this is different?

Because it is? A generic plugin mechanism is not a packaging feature any more 
than a HTTP client is a packaging feature, but setuptools contains one of those 
too. Since setuptools was in large part a packaging library, it will of course 
contain many packaging features that we’re going to standardize on, but 
something being in setuptools does not in fact make it a packaging feature in 
and of itself.

As an example of another setuptools feature that isn’t a packaging feature, I 
also would be against adding the resource APIs in a packaging standard because 
they’re not a packaging feature either, they’re a python import module feature 
(which is why Brett Cannon and Barry are adding them to importlib instead of 
trying to make a packaging PEP for them).



Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Thomas Kluyver
On Thu, Oct 19, 2017, at 08:01 PM, Donald Stufft wrote:
> 
>> On Oct 19, 2017, at 2:54 PM, Thomas Kluyver wrote:
>>
>> I don't think this needs to be controversial. They are a de-facto
>> packaging standard, whether or not that's theoretically necessary.
>> There's more than one tool that can create them (setuptools, flit), and
>> more than one that can consume them (pkg_resources, entrypoints). Lots
>> of packages use them, and they're not going anywhere soon. Describing
>> the format properly seems like a clear win.
> 
> 
> I disagree they are a packaging standard and I think it would be
> crummy to define it as one. I believe it is a setuptools feature, that
> flit and entrypoints wants to integrate with a setuptools feature is
> fine, but that doesn’t make it a packaging standard just because it
> came from setuptools. I agree that describing the format properly is a
> clear win, but I believe it belongs in the setuptools documentation.

pip and distlib also independently read this format without going
through setuptools. It's a de-facto standard already.  Entry points are
also the most common way for packages to install command-line scripts,
and the most effective way to do so across different platforms. So it's
essential that install tools do understand this.

Much of our packaging standards were built out of setuptools features
anyway - why pretend that this is different?


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft

> On Oct 19, 2017, at 2:54 PM, Thomas Kluyver  wrote:
> 
> I don't think this needs to be controversial. They are a de-facto
> packaging standard, whether or not that's theoretically necessary.
> There's more than one tool that can create them (setuptools, flit), and
> more than one that can consume them (pkg_resources, entrypoints). Lots
> of packages use them, and they're not going anywhere soon. Describing
> the format properly seems like a clear win.


I disagree they are a packaging standard and I think it would be crummy to 
define it as one. I believe it is a setuptools feature, that flit and 
entrypoints wants to integrate with a setuptools feature is fine, but that 
doesn’t make it a packaging standard just because it came from setuptools. I 
agree that describing the format properly is a clear win, but I believe it 
belongs in the setuptools documentation.


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Thomas Kluyver
On Thu, Oct 19, 2017, at 07:09 PM, Donald Stufft wrote:
> So here's a different idea that is a bit more ambitious but that I think
> is a better overall idea. Let entrypoints be a setuptools thing, and let's
> define some key lifecycle hooks during the installation of a package and
> some mechanism in the metadata to let other tools subscribe to those
> hooks.

I'd like to document the existing mechanism as previously suggested. Not
least because I've already written the PR ;-).

I don't think this needs to be controversial. They are a de-facto
packaging standard, whether or not that's theoretically necessary.
There's more than one tool that can create them (setuptools, flit), and
more than one that can consume them (pkg_resources, entrypoints). Lots
of packages use them, and they're not going anywhere soon. Describing
the format properly seems like a clear win.

For caching, I'm happy enough to work on a more general PEP to define
packaging hooks, so long as that isn't going to be as long a discussion
as PEP 517.

Daniel:
> How long does pkg_resources take to import for you folks?

About 0.5s on my laptop with an SSD, about 5s on a machine with a
spinning hard drive. This is simulating a cold start on both; it's much
quicker once the OS caches it in memory.

Thomas


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft

> On Oct 19, 2017, at 2:28 PM, Paul Moore  wrote:
> 
> While I agree with this, one thing I have noticed with recent work is
> that standardising existing things has typically been relatively
> painless and stress-free. But designing new mechanisms generally ends
> up with huge threads, heated debates, and people burning out on the
> whole thing. We've had a couple of cases of that recently, and in
> particular Thomas has endured the big PEP 517 debate, so I'm inclined
> to say we should take a rest from new designs for a while, and keep
> the scope here limited.

So I’m generally fine with keeping the scope limited, but for the same reason 
as I think the real solution is what I defined above, I think this 
isn’t/shouldn’t be a packaging standard and is a setuptools feature and should 
be documented/live there. If setuptools wants to enable people to directly 
manipulate those files they can document the standard of those files, if they 
want to treat it as internal and you’re expected to use their APIs then they 
can.

Essentially, I don’t think that a plugin system should be within the domain of 
distutils-sig or the PyPA and the only reason we’re even thinking of it as one 
is because (a) historically setuptools _had_ a plugin system and (b) we lack 
lifecycle hooks. I’m loath to move the documentation for a setuptools specific 
feature out of their documentation because I think it muddies the water further.


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Daniel Holth
I prefer a single more generic mechanism that packaging happens to use
instead of making special mechanisms for scripts or other callables that
packaging might some day be interested in. With one API, I can type
pkg_resources.iter_entry_points('console_scripts') to enumerate the scripts
and perhaps invoke them without the wrappers, or I can look up other plugins.
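
As a minimal sketch of that enumeration: pkg_resources.iter_entry_points() is the call named above, and the name, module_name and attrs attributes belong to pkg_resources.EntryPoint. The helper name list_console_scripts and the ImportError fallback (for environments without setuptools) are additions for illustration.

```python
def list_console_scripts():
    """Return (name, target) pairs for the installed console scripts."""
    try:
        import pkg_resources  # ships with setuptools; may be absent
    except ImportError:
        return []
    scripts = []
    for ep in pkg_resources.iter_entry_points("console_scripts"):
        target = ep.module_name
        if ep.attrs:
            target += ":" + ".".join(ep.attrs)
        scripts.append((ep.name, target))
    return scripts


if __name__ == "__main__":
    for name, target in list_console_scripts():
        print(f"{name} -> {target}")
```

Calling ep.load() on one of these entry points returns the underlying callable, which is how a script could be invoked without its wrapper.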

+1 on simply documenting what we have first.

How long does pkg_resources take to import for you folks?

Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Paul Moore
On 19 October 2017 at 19:09, Donald Stufft  wrote:
>
> So here’s a different idea that is a bit more ambitious but that I think is a
> better overall idea. Let entrypoints be a setuptools thing, and let’s define
> some key lifecycle hooks during the installation of a package and some
> mechanism in the metadata to let other tools subscribe to those hooks. Then
> a caching layer could be written for setuptools entrypoints to make that
> faster without requiring standardization, but also a whole new, better plugin
> system could too, Twisted plugins could benefit, etc [1].

I think this is a nice idea, and like you say could likely enable a
number of interesting use cases. However...

>
> One thing that I like about all of our work recently in packaging is a lot of 
> it has been about making it so there isn’t just one standard set of tools, 
> and I think that providing lifecycle hooks is another step along that path.

While I agree with this, one thing I have noticed with recent work is
that standardising existing things has typically been relatively
painless and stress-free. But designing new mechanisms generally ends
up with huge threads, heated debates, and people burning out on the
whole thing. We've had a couple of cases of that recently, and in
particular Thomas has endured the big PEP 517 debate, so I'm inclined
to say we should take a rest from new designs for a while, and keep
the scope here limited.

We can go back and hit packaging system hooks later, it's not like the
idea will go away. And the breathing space will also give people time
to actually implement the recent PEPs, and consolidate the gains we've
already made.

Paul


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft


> On Oct 19, 2017, at 12:14 PM, Thomas Kluyver  wrote:
> 
> On Thu, Oct 19, 2017, at 04:10 PM, Donald Stufft wrote:
>> I’m in favor, although one question I guess is whether it should be a
>> PEP or an ad hoc spec. Given (2) it should *probably* be a PEP (since
>> without (2), it’s just another file in the .dist-info directory and that
>> doesn’t actually need standardized at all). I don’t think that this will
>> be a very controversial PEP though, and should be pretty easy.
> 
> I have opened a PR to document what is already there, without adding any
> new features. I think this is worth doing even if we don't change
> anything, since it's a de-facto standard used for different tools to
> interact.
> 
> https://github.com/pypa/python-packaging-user-guide/pull/390
> 
> We can still write a PEP for caching if necessary.

I think documenting what’s there is a reasonable goal, but if we’re going to
add caching we should just PEP the whole thing, changing it from a de facto
standard to an actual standard + caching. Generally we should only use non-PEP
“specs” in places where we’re just trying to document what exists already, but 
where we’re not really happy with the current solution or we plan to alter it 
eventually.

For this, I think the entry points solution is generally a good one with some 
alterations (namely, the addition of caching)…. Although now that I think about 
it, maybe this isn’t really a packaging problem at all and I’m not sure that it 
benefits from standardization at all.

So stepping back a second, here’s what entrypoints provides today:

1. A way to define an interface that other packages can provide
implementations for.
2. A way to specify script wrappers that will be automatically generated.
3. A way to define extras that must be installed in order for a particular 
entry point to be available.

Off the bat I’m going to say we don’t need to worry about (2) in this 
hypothetical system, because I think the fact it is implemented currently via 
this system is mostly a historic accident, and it’s not something we should be 
looking at in the future. Script wrappers should have some dedicated metadata, 
not piggybacking off of the plugin system.
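
For context, a script wrapper of the kind described in (2) looks roughly like the output of this sketch. The template is an approximation, not the exact text any particular installer writes, and make_wrapper is a made-up helper name. It assumes the target is a plain 'module:function' reference with no dotted attribute path.

```python
import sys

WRAPPER_TEMPLATE = """\
#!{python}
import sys
from {module} import {func}

if __name__ == "__main__":
    sys.exit({func}())
"""


def make_wrapper(target, python=sys.executable):
    """Render a wrapper script for a 'module:function' target."""
    module, _, func = target.partition(":")
    if not module or not func:
        raise ValueError(f"expected 'module:function', got {target!r}")
    return WRAPPER_TEMPLATE.format(python=python, module=module, func=func)


if __name__ == "__main__":
    # 'pkg.cli:main' is a placeholder target.
    print(make_wrapper("pkg.cli:main"))
```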

For (3) I don’t believe that what extras were installed is recorded anywhere, 
so I’m going to guess that this works by looking up what extras are *available* 
for a particular package and then seeing if all of the requirements of that 
distribution are satisfied. Assuming that’s the case then that’s not really 
something that requires deep integration with the packaging toolchain, it just 
needs the APIs to look those things up.

Finally we come to (1), which is in my opinion the meat of what you’re hoping
to achieve here (and what most people are using entry points for outside of
console scripts). What I notice about (1) is that it really has absolutely
nothing to do with packaging at all. It would likely use some of the APIs
provided by the packaging toolchain (for instance, the ability to add custom
files to a .dist-info directory, the ability to iterate over installed
packages, etc.) but as a whole, none of pip, setuptools, twine, PyPI, etc.
need to know anything about it.
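
To make the point concrete, the two packaging-side capabilities mentioned here (custom files in .dist-info, iterating over installed packages) are enough for a plugin system to find its own metadata. A minimal sketch using only pathlib follows; find_metadata_files is a hypothetical helper, and it only looks at unpacked *.dist-info directories.

```python
import pathlib


def find_metadata_files(path_entries, filename):
    """Map each *.dist-info directory name to the text of `filename` in it."""
    found = {}
    for entry in path_entries:
        root = pathlib.Path(entry)
        if not root.is_dir():
            continue  # skip zip files and missing sys.path entries
        for dist_info in sorted(root.glob("*.dist-info")):
            candidate = dist_info / filename
            if candidate.is_file():
                found[dist_info.name] = candidate.read_text(encoding="utf-8")
    return found
```

Calling find_metadata_files(sys.path, "entry_points.txt") would return the raw entry point declarations of every unpacked distribution on the path, without the installer knowing what the file means.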

EXCEPT, for the fact that with the desire to cache things, it would be 
beneficial to “hook” into the lifecycle of a package install. However I know 
that there are other plugin systems out there that would like to also be able 
to do that (Twisted Plugins come to mind) and that I think outside of plugin 
systems, such a mechanism is likely to be useful in general for other cases.

So here’s a different idea that is a bit more ambitious but that I think is a
better overall idea. Let entrypoints be a setuptools thing, and let’s define
some key lifecycle hooks during the installation of a package and some
mechanism in the metadata to let other tools subscribe to those hooks. Then a
caching layer could be written for setuptools entrypoints to make that faster
without requiring standardization, but also a whole new, better plugin system
could too, Twisted plugins could benefit, etc [1].
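
A purely hypothetical sketch of the shape such hooks could take. No such API exists in pip or setuptools; the class name, the event names "post-install" and "post-uninstall", and the callback signature are all invented here to illustrate the idea of tools subscribing to installation events.

```python
from collections import defaultdict


class LifecycleHooks:
    """Hypothetical registry an installer could expose; not an existing API."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, callback):
        self._subscribers[event].append(callback)

    def fire(self, event, dist_name):
        # The installer would call this after each install or uninstall.
        for callback in self._subscribers[event]:
            callback(dist_name)


# A plugin system keeps its own cache current by subscribing to the hooks.
cache = set()
hooks = LifecycleHooks()
hooks.subscribe("post-install", cache.add)
hooks.subscribe("post-uninstall", cache.discard)

hooks.fire("post-install", "example-plugin")
hooks.fire("post-install", "another-plugin")
hooks.fire("post-uninstall", "example-plugin")
print(sorted(cache))  # prints ['another-plugin']
```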

One thing that I like about all of our work recently in packaging is a lot of 
it has been about making it so there isn’t just one standard set of tools, and 
I think that providing lifecycle hooks is another step along that path.


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Thomas Kluyver
On Thu, Oct 19, 2017, at 04:10 PM, Donald Stufft wrote:
> I’m in favor, although one question I guess is whether it should be a
> PEP or an ad hoc spec. Given (2) it should *probably* be a PEP (since
> without (2), it’s just another file in the .dist-info directory and that
> doesn’t actually need standardized at all). I don’t think that this will
> be a very controversial PEP though, and should be pretty easy.

I have opened a PR to document what is already there, without adding any
new features. I think this is worth doing even if we don't change
anything, since it's a de-facto standard used for different tools to
interact.

https://github.com/pypa/python-packaging-user-guide/pull/390

We can still write a PEP for caching if necessary.
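
To illustrate what a tool other than pkg_resources has to do with the de-facto format: entry_points.txt is an INI-style file with one section per group and name = object reference lines, so the standard library can read it. This is a minimal sketch; the sample package and entry point names are invented, and parse_entry_points is a hypothetical helper.

```python
import configparser

SAMPLE = """\
[console_scripts]
mytool = mypkg.cli:main

[pytest11]
myplugin = mypkg.pytest_plugin
"""


def parse_entry_points(text):
    """Return {group: {name: object_reference}} from entry_points.txt text."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry point names case-sensitive
    parser.read_string(text)
    return {group: dict(parser.items(group)) for group in parser.sections()}


if __name__ == "__main__":
    for group, entries in parse_entry_points(SAMPLE).items():
        print(group, entries)
```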

> I’m also in favor of this, although I would suggest SQLite rather than a
> JSON file. The primary reason is that a JSON file isn’t multiprocess
> safe without being careful (and possibly introducing locking), whereas
> SQLite has already solved that problem.

SQLite was actually my first thought, but from experience in Jupyter &
IPython I'm wary of it - its built-in locking does not work well over
NFS, and it's easy to corrupt the database. I think careful use of
atomic writing can be more reliable (though that has given us some
problems too).

That may be easier if there's one cache per user, though - we can
perhaps try to store it somewhere that's not NFS.
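
The atomic-writing approach can be sketched as follows: write the new cache to a temporary file in the same directory, then rename it over the old one, so readers see either the old file or the new one and never a partial write. This is a generic pattern rather than the code used in Jupyter or IPython, and write_cache_atomically is a made-up name.

```python
import json
import os
import tempfile


def write_cache_atomically(path, data):
    """Write data as JSON to path so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        # Atomic when both paths are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Two processes writing at the same time can still overwrite each other's update (the last rename wins), which is the kind of care a JSON cache needs and a database handles for you.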

Thomas


Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Donald Stufft

> On Oct 18, 2017, at 10:52 AM, Thomas Kluyver  wrote:
> 
> 
> 1. Specification
> 


I’m in favor, although one question I guess is whether it should be a PEP or
an ad hoc spec. Given (2) it should *probably* be a PEP (since without (2),
it’s just another file in the .dist-info directory and that doesn’t actually
need standardized at all). I don’t think that this will be a very controversial 
PEP though, and should be pretty easy.


> 
> 2. Caching


I’m also in favor of this, although I would suggest SQLite rather than a JSON
file. The primary reason is that a JSON file isn’t multiprocess safe without
being careful (and possibly introducing locking), whereas SQLite has already
solved that problem.

One possible further enhancement to your proposal is to try to think of a way
to have a single cache. Since we can include the sys.path entry as part of the
data inside the cache, a single cache would reduce the number of files we have
to open to one. The biggest problem I see with this is that it opens up
questions about how we handle things like user installs… so maybe a cache DB
per sys.path entry is the best way. I think we could use something like
SQLite’s ATTACH DATABASE command to add multiple DBs to the same SQLite
connection, so that we can query across all of the entries with a single
query. One downside is that sqlite3 is an optional module in Python, so it may
not exist. In that case we could just bypass the cache entirely (and probably
raise a warning?) so things continue to work; they will just be slower.
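
A minimal sketch of the per-sys.path-entry idea, assuming each cache is a SQLite file with a single entry_points table. The table layout, column names and helper names are invented for illustration; only sqlite3 and the ATTACH DATABASE statement are real.

```python
import sqlite3


def create_cache(path, rows):
    """Create a cache DB holding (group, name, target) rows."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS entry_points "
            "(grp TEXT, name TEXT, target TEXT)"
        )
        conn.executemany("INSERT INTO entry_points VALUES (?, ?, ?)", rows)
    conn.close()


def query_group(cache_paths, group):
    """Query one entry point group across several cache DBs at once."""
    conn = sqlite3.connect(":memory:")
    selects = []
    for i, path in enumerate(cache_paths):
        alias = f"cache{i}"  # generated locally, so safe to interpolate
        conn.execute(f"ATTACH DATABASE ? AS {alias}", (path,))
        selects.append(
            f"SELECT name, target FROM {alias}.entry_points WHERE grp = :grp"
        )
    rows = []
    if selects:
        sql = " UNION ALL ".join(selects)
        rows = conn.execute(sql, {"grp": group}).fetchall()
    conn.close()
    return rows
```

SQLite limits how many databases can be attached to one connection (10 by default), so a long sys.path would need batching or a different layout.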

I know that Twisted has used a cache file for a while for plugins (so a similar
use case), so I wonder if they would have any opinions or insight into this as
well.




Re: [Distutils] Entry points: specifying and caching

2017-10-19 Thread Wes Turner
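import json
import os
import sys

ENV_JSON_FILENAME = 'env.json'  # placeholder; the name is not specified here

def get_env_json_path():
    # Assumption: fall back to sys.prefix when no virtualenv is active.
    directory = os.environ.get('VIRTUAL_ENV') or sys.prefix
    return os.path.join(directory, ENV_JSON_FILENAME)

def load_env_json(path):
    # A missing cache file is treated as an empty cache.
    if not os.path.exists(path):
        return {'pkgs': {}}
    with open(path) as f:
        return json.load(f)

def on_install(pkgname, pkg_json):
    env_json_path = get_env_json_path()
    env_json = load_env_json(env_json_path)
    env_json['pkgs'][pkgname] = pkg_json
    with open(env_json_path, 'w') as f:
        json.dump(env_json, f)

def read_cached_entry_points():
    env_json = load_env_json(get_env_json_path())
    # Assumption: each package record holds a list under 'entry_points'.
    return [ep for pkg in env_json['pkgs'].values()
            for ep in pkg.get('entry_points', [])]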


Would this introduce a need for a new and confusing rescan_metadata()
(pkg.on_install() for pkg in pkgs)?

On Wednesday, October 18, 2017, Nick Coghlan  wrote:

> On 19 October 2017 at 12:16, Daniel Holth wrote:
>
>> We said "you won't have to install setuptools" but actually "you don't
>> have to use it" is good enough. If you had 2 pkg-resources implementations
>> running you might wind up scanning sys.path extra times...
>>
> True, but that's where Thomas's suggestion of attempting to define a
> standardised caching convention comes in: right now, there's no middle
> ground between "you must use pkg_resources" and "every helper library must
> scan for the raw entry-point metadata itself".
>
> If there's a defined common caching mechanism, and support for it is added
> to new versions of pkg_resources, then the design constraint becomes "If
> you end up using multiple entry-point scanners, you'll want a recent
> setuptools/pkg_resources, so you don't waste too much time on repeated
> metadata scans".
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>