On 28.09.2014 23:59, Donald Stufft wrote:
> 
>> On Sep 28, 2014, at 5:36 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>>
>> On 28.09.2014 21:31, Donald Stufft wrote:
>>> Hello All!
>>>
>>> I'd like to discuss the idea of moving PyPI to having immutable files. This
>>> would mean that once you publish a particular file you can never reupload
>>> that file again with different contents. This would still allow deleting
>>> the file or reuploading it if the checksums match what was there prior.
>>>
>>> This would be good for a few reasons:
>>>
>>> * It represents "best practices" for version numbers. Ideally if two people
>>>  have version "2.1" of a project, they'll have the same code, however as it
>>>  stands two people installing at two different times could have two very
>>>  different versions.
>>>
>>> * This will make improving the PyPI infrastructure easier, in particular it
>>>  will make it simpler to move away from using a glusterfs storage array and
>>>  switch to a redundant set of cloud object stores.
>>>
>>>
>>> In the past this was brought up and a few points were raised against it.
>>> Those were:
>>>
>>> 1. That authors could simply change files that were hosted off PyPI anyway,
>>>   so it didn't really do much.
>>>
>>> 2. That it was too hard to test a release prior to uploading it due to the
>>>   nature of distutils requiring you to build the release in the same command
>>>   as the upload.
>>>
>>> With the fact that pip no longer hits external URLs by default, I believe
>>> that the first item is no longer that large of a factor. People can do
>>> whatever they want on external URLs of course; however, if something is
>>> coming from PyPI, which end users should now be aware of, they can know it
>>> is immutable.
>>>
>>> Now that there is twine, which allows uploading already created packages, I
>>> also believe that the second item is no longer a concern. People can easily
>>> create a distribution using ``setup.py sdist``, test it, and then upload
>>> that exact thing they tested using ``twine upload <path to sdist>``.
>>
>> -1.
>>
>> It does happen that files need to be reuploaded because of a bug
>> in the release process and how people manage their code is really
>> *their* business, not that of PyPI.
> 
> Can you describe a reasonable hypothetical situation where this would occur
> often enough as to be something that is likely to happen on a consistent
> basis? Originally the problem was there was little ability to easily upload
> pre-created files so there was a reasonable chance that there may be a
> packaging bug that didn’t get exposed until you actually packaged + released.
>
> With the advent of twine though it’s now possible to test the exact bits that
> get uploaded to PyPI making that particular issue no longer a problem.
>
> However, the fact that the files are not immutable *does* cause a number of
> problems that need to be worked around in the mirroring infrastructure, the
> CDN, and for scaling PyPI out and removing the glusterfs component.

You are overlooking cases where the release process causes files to
be omitted, human errors where packagers forget to apply changes to
e.g. documentation files, version files, change logs, etc., and cases
where packagers want to add information that doesn't affect the software
itself, only the meta information included in the distribution files.

Such changes often do not affect the software itself, and so are not
detected by software tests.
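
An omission of this kind is invisible to a test suite, but a separate
check of the archive contents can list it. The following is a minimal
sketch, assuming the packager keeps a list of files that must ship in
every sdist. The function name ``missing_from_sdist`` and the file names
in the usage note are hypothetical and not part of any existing tool.

```python
import tarfile


def missing_from_sdist(sdist_path, expected):
    """Return the expected relative paths that are absent from the sdist.

    Assumes the usual sdist layout, where every member lives under a
    single top-level "<name>-<version>/" directory.
    """
    with tarfile.open(sdist_path) as archive:
        names = archive.getnames()
    # Strip the top-level directory so paths compare against `expected`.
    present = {name.split("/", 1)[1] for name in names if "/" in name}
    return sorted(set(expected) - present)
```

Called on a freshly built archive, for example
``missing_from_sdist("dist/demo-1.0.tar.gz", ["README.txt", "CHANGELOG.txt"])``,
it returns the listed files that did not make it into the archive, or an
empty list if nothing is missing.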

If I understand you correctly, you are essentially suggesting that it
becomes impossible to ever delete anything uploaded to PyPI, i.e.
turning PyPI into a WORM.

This would mean that package authors could never correct mistakes or
remove broken distribution files, ones which they may be
forced to remove for legal reasons, ones which they find are infected
with a virus or trojan, or ones which they uploaded for fun or
by mistake.

This doesn't have anything to do with making the user experience
a better one. It is ignorant to assume that package authors who
sometimes delete distribution files, or at least want to have the
possibility to do so, don't care for their users. We are in
Python land, so most authors will know what they are doing and
do care for their users.

After all: Why do you think I'm arguing against this proposal ?
Because I want users of our packages to get the best experience
they can get, by downloading complete, correct and working
distribution files.

This whole idea also has another angle, namely a legal one:
the PSF doesn't own the distribution files it hosts on PyPI.

So far, the argument to not fix the much too broad license on PyPI
was that authors were able to delete files on PyPI to work around
the unneeded "irrevocable" part of that license.

With the suggested change, authors would have to give up complete
control over their distribution files to the PSF in order for their
packages to be installable by pip using its default settings.

This kind of lock-in and removal of author rights is not something
I can support as PSF director. Those authors are the ones that have
created a large part of our Python ecosystem and they are the ones that
have put in work to get Python to where it is now: one of the best
integrated programming languages you can find. We owe a lot to those
authors and need to care for them.

Finally, changes such as the above will result in more authors
switching to alternative hosting platforms such as conda/binstar.org
or plain github clone + setup.py install (which is becoming increasingly
popular). Do you really believe that this will make the user experience
a better one in the long run ?

If we want to make it attractive for package authors to host their
packages on PyPI, we have to give them flexibility, respect their
rights and be welcoming.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 29 2014)