[Good replies from Donald, Paul, et al. already, but rather than replying to individual points, I figure it's best to just respond to Chris's original question with my own thoughts]
On 23 July 2016 at 01:47, Chris Barker - NOAA Federal <[email protected]> wrote:
> Right now, the barrier to entry to putting a package up on PyPI is
> very, very low. There are a lot of 'toy', or 'experimental', or 'well
> intentioned but abandoned' projects on there.
>
> If there was a clear and obvious place for folks to put these packages
> while they work out the kinks, and determine whether the package is
> really going to fly, I think people would use it.

That place is PyPI. Having a separate "maybe good, maybe bad" location for experimental packages (which is the way a lot of people use GitHub repos these days, relying on direct-from-VCS installs) leads to a persistent problem where folks later decide "I want to publish this officially", go to claim the name on PyPI as well, and find they have a conflict.

As further examples of similar "multiple authoritative namespaces" problems, we sometimes see folks creating Python projects specifically for Linux distros rather than for upstream Python, causing name collisions when the upstream project is later packaged for that distro (e.g. python-mock conflicting with Fedora's RPM "mock" build tool, resolved by the Fedora library being renamed to "mockbuild" while keeping "mock" as the CLI name), and we also see cross-ecosystem conflicts (e.g. python-pip and perl-pip conflicting on the "pip" CLI name, resolved by the good graces of the perl-pip maintainer in ceding the unqualified name to the Python package).

You can also look at the number of semantically versioned packages that enjoy huge adoption even before the author(s) pull the trigger on a 1.0 release (e.g. requests, SQLAlchemy, and the TOML spec, off the top of my head), which reveals that package authors often have higher standards for "good enough for 1.0" than their prospective users do (the standard for users is generally "it's good enough to solve my current problem", while the standard for maintainers is more likely to be "it isn't a hellish nightmare to maintain as people start finding more corner cases that didn't previously occur to me/us").

The other instinctive answer is "namespaces solve this!", but the truth is they don't, as:

1. Namespaces tend to belong to organisations, and particularly for utility projects, "this utility happened to be developed by this organisation" is entirely arbitrary and mostly irrelevant (except insofar as if you trust a particular org you may trust their libraries more, but you can get that from the metadata).

2. If you want to build a genuinely inclusive open source project, branding it with the name of your company is one of the *worst* things you can do (since it prevents any chance of a feeling of shared ownership by the entire community).

3. Python already allows distribution package names to differ from import package names, as well as supporting namespace packages (see the sketch below), which means folks *could* have adopted namespaces-by-convention if they were a genuinely compelling solution. That hasn't happened, which suggests there's an inherent user experience problem with the idea.

Would requests be more discoverable if Kenneth had called it "kreitz-requests" instead? What if we had "org-pocoo-flask" instead of just plain "flask"? Or "ljworld-django"? How many folks haven't even looked at "zope.interface" because the association with Zope prompts them to dismiss it out of hand?
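For anyone who hasn't bumped into the distribution-name-vs-import-name split mentioned in point 3, here's a minimal setuptools sketch (the "acme-widgets"/"widgets"/"acme.*" names are purely hypothetical): the name you claim on PyPI and the name users import are declared independently, so an org-style naming convention has always been technically possible.

    # setup.py -- minimal sketch, hypothetical names throughout
    from setuptools import setup, find_packages

    setup(
        # The distribution name: what gets claimed on PyPI and passed to "pip install"
        name="acme-widgets",
        version="0.1.0",
        # The import packages: what users actually "import"; nothing requires
        # these to match the distribution name above
        packages=find_packages(include=["widgets", "widgets.*"]),
    )

A namespaces-by-convention layout would have been just as easy to support (e.g. a PEP 420 "acme" namespace directory containing "acme/widgets/", imported as "acme.widgets"), which is the point above: the tooling has never been the blocker.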
Organisational namespaces on a service like GitHub are *absolutely* useful, but what they're replacing is the model where different organisations run their own version control server (just as different organisations may run their own private Python index today), rather than being a good thing to expose directly to end users that just want to locate and use a piece of software.

> However, in these discussions, I've observed a common theme: folks in
> the community bring up issues about unmaintained packages, namespace
> pollution, etc. the core PyPA folks respond with generally well
> reasoned arguments why proposed solutions won't fly.
>
> But it's totally unclear to me whether the core devs don't think these
> are problems worth addressing, or think they can only be addressed
> with major effort that no one has time for.

If we accept my premise that "single global namespace, flat except by convention" really does offer the most attractive overall user experience for a software distribution ecosystem, what's missing from the status quo on PyPI?

In a word? Gardening.

Post-publication curation like that provided by Linux distros and other redistributor communities (including conda-forge) can help with the "What's worth my time and attention?" question (we can think of this as filtering the output of an orchard, and only passing along the best fruit), but it can't address the problem of undesirable name collisions and other problems on the main index itself (we can think of this as actually weeding the original orchard, and pruning the trees when necessary).

However, we don't have anyone that's specifically responsible for looking after the shared orchard that is PyPI, and this isn't something we can reasonably ask volunteers to do, as it's an intensely political and emotionally draining task where your primary responsibility is deciding if and when it's appropriate to *take people's toys away*. As an added bonus, becoming more active in content curation as a platform provider means potentially opening yourself up to lawsuits as well, if folks object either to your reclamation of a name they previously controlled, or else to your refusal to reclaim a particular name for *their* purposes.

So while first-come-first-served namespace management definitely doesn't provide the best possible user experience for Pythonistas, it *does* minimise the volume of ongoing namespace maintenance work required, as well as the PSF's exposure to legal liability as the platform provider.

These concerns aren't something that "policy enforcement can be automated" really addresses either: even with algorithmic enforcement, those "The robots have decided to take your project away from you" emails still go to real people, and those folks may still be understandably upset. "Our algorithms did it, not our staff" is also a pretty thin legal reed to pin your hopes on if somebody turns out to be upset enough to sue you over it.

This means the entire situation changes if the PSF's Packaging Working Group receives sufficient funding (either directly through https://donate.pypi.io or through general PSF sponsorships) to staff at least one full-time "PyPI Publisher Support" role (in addition to the other points on the WG's TODO list), as well as to pay for analyses of the legal implications of having more formal content curation policies. However, in the absence of such ongoing funding support, the current laissez-faire policies will necessarily remain in place, as they're the only affordable option.
Regards,
Nick.

P.S. If folks working at end user organisations, for whom contributing something like $10k a year to the PSF for a Silver sponsorship (https://www.python.org/psf/sponsorship/) would be a rounding error in the annual budget, want to see change in this kind of area, then advocating within your organisation to become PSF sponsor members on the basis of "We want to enable the PSF to work on community infrastructure improvement activities it's currently not pursuing for lack of resources" is probably one of *the most helpful things you can do* for the wider Python community.

As individuals, we can at most sustainably contribute a few hours a week as volunteers, or maybe wrangle a full-time upstream contribution role if we're lucky enough to have or find a supportive employer. By contrast, the PSF is in a position to look for ways to let volunteers focus their time and energy on activities that are more inherently rewarding, by directing funding towards the many necessary-but-draining activities that go into supporting a collaborative software development community. The more resources the PSF has available, the more time and energy it will be able to devote towards identifying and addressing unwanted sources of friction in the collaborative process.

--
Nick Coghlan | [email protected] | Brisbane, Australia
