Re: PyPI and debian/watch
On Feb 04, 2015, at 10:53 AM, Donald Stufft wrote:

> That same page also mentions that qa.debian.org runs a number of redirectors for sites like SourceForge and GitHub so perhaps a better answer is for Debian QA to run a redirector for PyPI instead of PyPI implementing a redundant API endpoint with a slightly different layout and HTML primarily for Debian.

+1

Cheers,
-Barry
Re: PyPI and debian/watch
Tristan Seligmann mithra...@mithrandi.net writes:

> The debian/watch file I wrote for python-nacl (which also verifies the PGP signature) seems to work.

I can't get PGP signature retrieval to work (“uscan warning: pgpsigurlmangle option exists, but the upstream keyring does not exist”) even with your suggested pattern.

But I have also written a working uscan configuration::

    opts=filenamemangle=s/\S+\/([^\/]+\.tar\.gz)#md5=[[:alnum:]]+$/$1/ \
    https://pypi.python.org/simple/python-daemon/ \
    \S+/python-daemon-(\S+)\.tar\.gz#md5=[[:alnum:]]+ \
    debian

Barry Warsaw ba...@debian.org writes:

> I'd love to be able to have something as simple as:
>
>     version=3
>     https://pypi.python.org/simple/mypkg/mypkg-(.*).tar.gz
>
> which is close to what most packages probably use today, modulo the base url path.

That would be great. But remember that the uscan documentation recommends a tighter matching pattern, so that would be::

    version=3
    https://pypi.python.org/simple/mypkg/mypkg-(.+).tar.gz

> I filed a bug against pypa/warehouse so hopefully we can get something better before Jessie is released (which is when I think there will be more pressure for a better solution, since most packages won't be updated during the freeze).
>
> https://github.com/pypa/warehouse/issues/358

Thanks very much! I'm not a fan of having it live at “…/uscan/” though. This is not specific to Debian, it's a sensible API design for all.

-- 
 \     “Probably the earliest flyswatters were nothing more than some |
  `\    sort of striking surface attached to the end of a long stick.” |
_o__)                                                    —Jack Handey |
Ben Finney

-- 
To UNSUBSCRIBE, email to debian-python-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/85h9v2gkz5@benfinney.id.au
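To see what the two regular expressions in Ben Finney's python-daemon configuration above do to a single link, here is a minimal Python sketch. It is not uscan itself, only a translation under stated assumptions: the href pattern is assumed to be matched against the whole href, the POSIX class [[:alnum:]] is spelled [0-9A-Za-z] because Python's re module has no POSIX classes, and SAMPLE_HREF plus the function names are made up for illustration.

```python
import re

# Translation of the href pattern from the watch file (assumption: uscan
# requires the whole href to match, hence fullmatch below).
HREF_PATTERN = re.compile(r"\S+/python-daemon-(\S+)\.tar\.gz#md5=[0-9A-Za-z]+")

# Translation of filenamemangle=s/\S+\/([^\/]+\.tar\.gz)#md5=[[:alnum:]]+$/$1/
FILENAME_MANGLE = re.compile(r"\S+/([^/]+\.tar\.gz)#md5=[0-9A-Za-z]+$")

# Hypothetical href of the shape served on the /simple/ pages.
SAMPLE_HREF = (
    "../../packages/source/p/python-daemon/"
    "python-daemon-2.0.5.tar.gz#md5=0123456789abcdef0123456789abcdef"
)


def match_version(href):
    """Return the captured upstream version, or None if the href does not match."""
    match = HREF_PATTERN.fullmatch(href)
    return match.group(1) if match else None


def mangle_filename(href):
    """Reduce the href to the bare tarball name, dropping path and fragment."""
    return FILENAME_MANGLE.sub(r"\1", href)


if __name__ == "__main__":
    print(match_version(SAMPLE_HREF))
    print(mangle_filename(SAMPLE_HREF))
```

Both expressions have to mention the #md5= fragment explicitly, which is the copy/paste hazard discussed elsewhere in this thread.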
Re: PyPI and debian/watch
On 4 February 2015 at 10:05, Ben Finney ben+deb...@benfinney.id.au wrote:

> Tristan Seligmann mithra...@mithrandi.net writes:
>
> > The debian/watch file I wrote for python-nacl (which also verifies the PGP signature) seems to work.
>
> I can't get PGP signature retrieval to work (“uscan warning: pgpsigurlmangle option exists, but the upstream keyring does not exist”) even with your suggested pattern.

You need a debian/upstream/signing-key.asc with the ASCII-armored PGP keys which will be accepted for signing the release.

-- 
mithrandi, i Ainil en-Balandor, a faer Ambar
Re: PyPI and debian/watch
On Feb 4, 2015, at 11:20 AM, Barry Warsaw ba...@debian.org wrote:

> On Feb 04, 2015, at 10:53 AM, Donald Stufft wrote:
>
> > That same page also mentions that qa.debian.org runs a number of redirectors for sites like SourceForge and GitHub so perhaps a better answer is for Debian QA to run a redirector for PyPI instead of PyPI implementing a redundant API endpoint with a slightly different layout and HTML primarily for Debian.
>
> +1
>
> Cheers,
> -Barry

Dunno the best way to give this to y'all but I wrote a thing: https://github.com/dstufft/pypi-debian

I can transfer it on GitHub or release it on PyPI or whatever. It shouldn't really need any maintenance or anything but I'm probably not going to pay attention to it if it does need any, so someone else might want to actually own it.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: PyPI and debian/watch
On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:

> If it gets implemented it'll live at /uscan/ because it exists primarily to work around the deficiencies that exist in uscan (particularly the difficulty in ignoring URL fragments).

This seems like we're building a workaround to a tool we could theoretically change. :(

debian/watch has a version=3, which is presumably so that there can be a version=4 when deficiencies are discovered -- wouldn't it be worthwhile to consider revbumping the watch format and updating uscan to have some improved support for edge cases like this? I know uscan has some other open bugs too that could use some thought towards a more flexible format to handle cases like this.

♥,
- Tianon
  4096R / B42F 6819 007F 00F8 8E36 4FD4 036A 9C25 BF35 7DD4
Re: PyPI and debian/watch
On Feb 4, 2015, at 3:02 PM, Tianon Gravi admwig...@gmail.com wrote:

> On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:
>
> > If it gets implemented it'll live at /uscan/ because it exists primarily to work around the deficiencies that exist in uscan (particularly the difficulty in ignoring URL fragments).
>
> This seems like we're building a workaround to a tool we could theoretically change. :(
>
> debian/watch has a version=3, which is presumably so that there can be a version=4 when deficiencies are discovered -- wouldn't it be worthwhile to consider revbumping the watch format and updating uscan to have some improved support for edge cases like this? I know uscan has some other open bugs too that could use some thought towards a more flexible format to handle cases like this.

We talked about this in #debian-python and there was concern that a new version of uscan wouldn’t be in Jessie and then wouldn’t cover the people who need it the most.

I don’t know if that’s true or not but I certainly think that uscan _should_ ignore anything that comes after a # (similarly to how it ignores anything that comes after a ?). That would solve the largest problem, that the URL fragment is hard to remove from the d/watch file. The other problem is that /simple/whatever/ has files located at /packages/stuff but I believe that’s not very hard to work around.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
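The behaviour suggested above (drop everything after a #, as is already done for everything after a ?) can be sketched with Python's standard urllib.parse. This is only an illustration of the idea, not a description of how uscan is implemented; the example.org URLs and the function names are hypothetical.

```python
from urllib.parse import urldefrag, urlsplit


def strip_fragment(url):
    """Drop the '#...' part of a URL and keep everything else."""
    return urldefrag(url).url


def strip_query_and_fragment(url):
    """Drop both the '?...' and the '#...' parts of a URL."""
    return urlsplit(url)._replace(query="", fragment="").geturl()


if __name__ == "__main__":
    # Hypothetical URLs, shaped like the links discussed in this thread.
    print(strip_fragment("https://example.org/packages/mypkg-1.0.tar.gz#md5=abc123"))
    print(strip_query_and_fragment(
        "https://example.org/dl/mypkg-1.0.tar.gz?download=1#md5=abc123"))
```

With the fragment removed before matching, a watch pattern would no longer need to mention #md5= at all.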
Re: PyPI and debian/watch
On 4 February 2015 at 13:06, Donald Stufft don...@stufft.io wrote:

> We talked about this in #debian-python and there was concern that a new version of uscan wouldn’t be in Jessie and then wouldn’t cover the people who need it the most.

Ah right, that makes sense -- I forgot that we were talking about this specifically in the context of getting some kind of fix into Jessie.

♥,
- Tianon
  4096R / B42F 6819 007F 00F8 8E36 4FD4 036A 9C25 BF35 7DD4
Re: PyPI and debian/watch
Hi Donald (2015.02.04_22:06:25_+0200)

> On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:
>
> > If it gets implemented it'll live at /uscan/ because it exists primarily to work around the deficiencies that exist in uscan (particularly the difficulty in ignoring URL fragments).

Would it be that hard to have fake directory listings on /simple/? I mean, surely keeping compatibility there is simpler than having a second endpoint just for Debian.

> We talked about this in #debian-python and there was concern that a new version of uscan wouldn’t be in Jessie and then wouldn’t cover the people who need it the most.

Who needs it the most? We could fix it in unstable and backports. The DEHS data on tracker.debian.org comes from quantz.debian.org, which is currently using devscripts from backports.

> I don’t know if that’s true or not but I certainly think that uscan _should_ ignore anything that comes after a # (similarly to how it ignores anything that comes after a ?).

Agreed.

SR

-- 
Stefano Rivera
http://tumbleweed.org.za/
+1 415 683 3272
Re: PyPI and debian/watch
On Feb 4, 2015, at 4:32 PM, Stefano Rivera stefa...@debian.org wrote:

> Hi Donald (2015.02.04_22:06:25_+0200)
>
> > On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:
> >
> > > If it gets implemented it'll live at /uscan/ because it exists primarily to work around the deficiencies that exist in uscan (particularly the difficulty in ignoring URL fragments).
>
> Would it be that hard to have fake directory listings on /simple/? I mean, surely keeping compatibility there is simpler than having a second endpoint just for Debian.

All the data that uscan needs is already on /simple/, so you can make uscan work with it. There is one major problem and one small problem:

1. Major: The /simple/ URLs all have a #md5=hash, and it’s non-trivial to write a d/watch file that ignores them, and uscan doesn’t by default. You can do it but it’s ugly and prone to copy/paste bugs.
2. Minor: The URLs on /simple/ point to /packages/, so it requires the 2 arg form of d/watch instead of the single arg form.

So you can make uscan work right now with /simple/ (and a few people have) but #1 means that a few of the #debian-python people were not very happy with that solution. I can’t remove/modify that hash without causing issues with pip/easy_install though.

Originally I was going to just make a /uscan/ that was /simple/ without the hash, but instead I suggested to #debian-python that a redirector might be better and there is now one at pypi.debian.net.

> > We talked about this in #debian-python and there was concern that a new version of uscan wouldn’t be in Jessie and then wouldn’t cover the people who need it the most.
>
> Who needs it the most? We could fix it in unstable and backports. The DEHS data on tracker.debian.org comes from quantz.debian.org, which is currently using devscripts from backports.

No idea, I’m just repeating what folks said in #debian-python; I have no idea who runs uscan and on what platforms.

Between fixing uscan and having a redirector I don’t have an opinion, since neither one of those has an impact on what PyPI does.

> > I don’t know if that’s true or not but I certainly think that uscan _should_ ignore anything that comes after a # (similarly to how it ignores anything that comes after a ?).
>
> Agreed.
>
> SR

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
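The two problems listed in the message above (the #md5= fragment, and links that point away from /simple/ to /packages/) are both easy to handle for a client that actually parses the page. The following Python sketch shows one way, using only the standard library. It makes no claim about how pip, easy_install, or the pypi.debian.net redirector work; the sample page, the package name mypkg, and the class and function names are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect absolute, fragment-free URLs from the <a href> tags of a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve the relative link (problem 2), then drop '#md5=...' (problem 1).
            absolute = urljoin(self.base_url, href)
            self.links.append(urldefrag(absolute).url)


# Hypothetical index page in the style of /simple/mypkg/.
SAMPLE_PAGE = (
    '<a href="../../packages/source/m/mypkg/mypkg-1.0.tar.gz'
    '#md5=0123456789abcdef0123456789abcdef">mypkg-1.0.tar.gz</a>\n'
    '<a href="../../packages/source/m/mypkg/mypkg-1.1.tar.gz'
    '#md5=fedcba9876543210fedcba9876543210">mypkg-1.1.tar.gz</a>\n'
)


def clean_links(page, base_url):
    """Return the download URLs found in page, resolved against base_url."""
    collector = LinkCollector(base_url)
    collector.feed(page)
    return collector.links


if __name__ == "__main__":
    for link in clean_links(SAMPLE_PAGE, "https://pypi.python.org/simple/mypkg/"):
        print(link)
```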
Re: PyPI and debian/watch
On Feb 4, 2015, at 10:07 AM, Barry Warsaw ba...@debian.org wrote:

> On Feb 04, 2015, at 08:08 AM, Donald Stufft wrote:
>
> > If it gets implemented it'll live at /uscan/ because it exists primarily to work around the deficiencies that exist in uscan (particularly the difficulty in ignoring URL fragments). Everyone else should just use the URLs at /simple/ which most systems use with no problem because they can parse the URLs and ignore the URL fragments (or use them for verifying the hash if need be).
>
> I'll just note that I've found the fragments inconvenient in other settings too. They aren't very user friendly since (IMHO) they add noise that users cutting and pasting urls generally don't care about.
>
> They also feel odd in that the md5 checksum doesn't fit what I think of as a typical fragment. Traditionally, they are used to point to an anchor (sub-resource) within the parent resource. That's not the case here.
>
> http://en.wikipedia.org/wiki/Fragment_identifier has this to say:
>
> > Several proposals have been made for fragment identifiers for use with plain text documents (which cannot store anchor metadata), or to refer to locations within HTML documents in which the author has not used anchor tags:
> >
> > As of September 2012 the Media Fragments URI 1.0 (basic) is a W3C Recommendation.[12]
> >
> > The Python Package Index appends the MD5 hash of a file to the URL as a fragment identifier.[13] If MD5 were unbroken (it is a broken hash function), it could be used to ensure the integrity of the package.
> >
> > https://pypi.python.org ... zodbbrowser-0.3.1.tar.gz#md5=38dc89f294b24691d3f0d893ed3c119c
>
> So even without the uscan incompatibility (which is just one of the two factors leading to noisy d/watch file), I think there's some value in fragment-less URLs.
>
> I understand the checksum isn't being used cryptographically here, but maybe thinking ahead to the use of more secure algorithms in the future can lead to a more flexible design:
>
> Legacy (if it indeed needs to be kept for backward compatibility):
>
>     /simple/foo-x.y.z#md5=blah
>
> then:
>
>     /simple/plain/foo-x.y.z
>     /simple/sha256/foo-x.y.z#sha256=blah

Long term PyPI is going to move away from trying to cram a bunch of information into a hyperlink and relying on HTML parsing, and instead is going to move the installer APIs over to using something more suited to the task, most likely JSON. At that point we'll be able to design the API to be more extendable in this regard, since we'll be able to do something like:

    {
        ...,
        hashes: {
            md5: ...,
            sha256: ...,
        },
        ...
    }

and the client can simply select which hash it wants to use. Long term the /simple/ API on PyPI will exist only for legacy purposes so people still using versions of pip, easy_install, etc that only support /simple/ will still be able to access PyPI. That doesn't really help uscan at all though, since as far as I know uscan has no ability to parse JSON.

As far as copy/pasting goes, the /simple/ API is an API; it's not designed to be human consumable but consumable by software. The UI centric pages at /pypi are the ones designed to be consumable by humans (although currently PyPI puts the hash there as well, however Warehouse (aka PyPI 2.0) does not).

The problem here really lies within uscan making assumptions about the structure of URLs and the content of the HTML on those pages. From looking at https://wiki.debian.org/debian/watch I'm guessing that it inherited those assumptions from when FTP was the more common way to distribute files instead of HTTP(S).

That same page also mentions that qa.debian.org runs a number of redirectors for sites like SourceForge and GitHub so perhaps a better answer is for Debian QA to run a redirector for PyPI instead of PyPI implementing a redundant API endpoint with a slightly different layout and HTML primarily for Debian.

One note in that regard is that the /simple/ indexes don't include the .asc files if someone has uploaded them, however the old URLs that debian/watch used did. If that is something that is needed we could easily add them to the /simple/ pages.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
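The "client can simply select which hash it wants to use" idea from the message above can be sketched in a few lines of Python. The record layout here is an assumption modelled on the elided JSON outline in that message; no such PyPI API existed at the time of writing, so the field names, the preference order, and the function names are all hypothetical.

```python
import hashlib
import json

# Client-side preference order, strongest first (an assumption for this sketch).
PREFERRED_ALGORITHMS = ("sha256", "md5")


def pick_hash(hashes, preferred=PREFERRED_ALGORITHMS):
    """Return (algorithm, expected_digest) for the most preferred algorithm offered."""
    for name in preferred:
        if name in hashes:
            return name, hashes[name]
    raise ValueError("no supported hash algorithm offered")


def verify(data, hashes):
    """Check data against the most preferred digest listed in hashes."""
    name, expected = pick_hash(hashes)
    return hashlib.new(name, data).hexdigest() == expected


# Build a hypothetical record the way a server might, then round-trip it as JSON.
PAYLOAD = b"example file contents"
RECORD = json.loads(json.dumps({
    "filename": "foo-1.0.tar.gz",
    "hashes": {
        "md5": hashlib.md5(PAYLOAD).hexdigest(),
        "sha256": hashlib.sha256(PAYLOAD).hexdigest(),
    },
}))

if __name__ == "__main__":
    print(pick_hash(RECORD["hashes"])[0])
    print(verify(PAYLOAD, RECORD["hashes"]))
```

Because the digests live in a mapping rather than in the URL, adding a new algorithm means adding a key, and older clients simply keep using the ones they know.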
Re: PyPI and debian/watch
On Feb 4, 2015, at 3:05 AM, Ben Finney ben+deb...@benfinney.id.au wrote:

> Tristan Seligmann mithra...@mithrandi.net writes:
>
> > The debian/watch file I wrote for python-nacl (which also verifies the PGP signature) seems to work.
>
> I can't get PGP signature retrieval to work (“uscan warning: pgpsigurlmangle option exists, but the upstream keyring does not exist”) even with your suggested pattern.
>
> But I have also written a working uscan configuration::
>
>     opts=filenamemangle=s/\S+\/([^\/]+\.tar\.gz)#md5=[[:alnum:]]+$/$1/ \
>     https://pypi.python.org/simple/python-daemon/ \
>     \S+/python-daemon-(\S+)\.tar\.gz#md5=[[:alnum:]]+ \
>     debian
>
> Barry Warsaw ba...@debian.org writes:
>
> > I'd love to be able to have something as simple as:
> >
> >     version=3
> >     https://pypi.python.org/simple/mypkg/mypkg-(.*).tar.gz
> >
> > which is close to what most packages probably use today, modulo the base url path.
>
> That would be great. But remember that the uscan documentation recommends a tighter matching pattern, so that would be::
>
>     version=3
>     https://pypi.python.org/simple/mypkg/mypkg-(.+).tar.gz
>
> > I filed a bug against pypa/warehouse so hopefully we can get something better before Jessie is released (which is when I think there will be more pressure for a better solution, since most packages won't be updated during the freeze).
> >
> > https://github.com/pypa/warehouse/issues/358
>
> Thanks very much! I'm not a fan of having it live at “…/uscan/” though. This is not specific to Debian, it's a sensible API design for all.

If it gets implemented it'll live at /uscan/ because it exists primarily to work around the deficiencies that exist in uscan (particularly the difficulty in ignoring URL fragments). Everyone else should just use the URLs at /simple/ which most systems use with no problem because they can parse the URLs and ignore the URL fragments (or use them for verifying the hash if need be).

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
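As an illustration of "use them for verifying the hash", here is a small Python sketch that splits a link of the #md5=... form into a clean URL plus an (algorithm, digest) pair, and then checks downloaded bytes against it. It is not taken from pip or easy_install; the example.org URL, the payload, and the function names are made up for the example.

```python
import hashlib
from urllib.parse import urlsplit


def split_hash_fragment(url):
    """Return (url_without_fragment, algorithm, digest).

    algorithm and digest are None when the URL carries no 'name=value' fragment.
    """
    parts = urlsplit(url)
    clean_url = parts._replace(fragment="").geturl()
    name, separator, digest = parts.fragment.partition("=")
    if not separator:
        return clean_url, None, None
    return clean_url, name, digest


def fragment_matches(data, algorithm, digest):
    """Check downloaded bytes against the digest taken from the fragment."""
    return hashlib.new(algorithm, data).hexdigest() == digest.lower()


# Hypothetical download: the digest is computed here so the example is consistent.
PAYLOAD = b"pretend this is a tarball"
SAMPLE_URL = (
    "https://example.org/packages/mypkg-1.0.tar.gz#md5="
    + hashlib.md5(PAYLOAD).hexdigest()
)

if __name__ == "__main__":
    url, algorithm, digest = split_hash_fragment(SAMPLE_URL)
    print(url, algorithm, fragment_matches(PAYLOAD, algorithm, digest))
```

As noted elsewhere in the thread, MD5 is a broken hash function, so a check like this guards against accidental corruption rather than a deliberate attacker.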
Re: PyPI and debian/watch
On Feb 04, 2015, at 08:08 AM, Donald Stufft wrote:

> If it gets implemented it'll live at /uscan/ because it exists primarily to work around the deficiencies that exist in uscan (particularly the difficulty in ignoring URL fragments). Everyone else should just use the URLs at /simple/ which most systems use with no problem because they can parse the URLs and ignore the URL fragments (or use them for verifying the hash if need be).

I'll just note that I've found the fragments inconvenient in other settings too. They aren't very user friendly since (IMHO) they add noise that users cutting and pasting urls generally don't care about.

They also feel odd in that the md5 checksum doesn't fit what I think of as a typical fragment. Traditionally, they are used to point to an anchor (sub-resource) within the parent resource. That's not the case here.

http://en.wikipedia.org/wiki/Fragment_identifier has this to say:

> Several proposals have been made for fragment identifiers for use with plain text documents (which cannot store anchor metadata), or to refer to locations within HTML documents in which the author has not used anchor tags:
>
> As of September 2012 the Media Fragments URI 1.0 (basic) is a W3C Recommendation.[12]
>
> The Python Package Index appends the MD5 hash of a file to the URL as a fragment identifier.[13] If MD5 were unbroken (it is a broken hash function), it could be used to ensure the integrity of the package.
>
> https://pypi.python.org ... zodbbrowser-0.3.1.tar.gz#md5=38dc89f294b24691d3f0d893ed3c119c

So even without the uscan incompatibility (which is just one of the two factors leading to noisy d/watch file), I think there's some value in fragment-less URLs.

I understand the checksum isn't being used cryptographically here, but maybe thinking ahead to the use of more secure algorithms in the future can lead to a more flexible design:

Legacy (if it indeed needs to be kept for backward compatibility):

    /simple/foo-x.y.z#md5=blah

then:

    /simple/plain/foo-x.y.z
    /simple/sha256/foo-x.y.z#sha256=blah

etc.

Cheers,
-Barry
Re: PyPI and debian/watch
On Feb 04, 2015, at 01:09 AM, Tristan Seligmann wrote:

> The debian/watch file I wrote for python-nacl (which also verifies the PGP signature) seems to work. It looks like this:
>
>     version=3
>     opts=pgpsigurlmangle=s/\#md5.*$/.asc/,filenamemangle=s|.*/(.*)\#md5.*$|$1| \
>     https://pypi.python.org/simple/PyNaCl/ .*/PyNaCl-(.*)\.(?:zip|tgz|tbz2|txz|tar\.(?:gz|bz2|xz)).*
>
> It's a bit ugly, but it does work.

Yep, thanks. Sandro also provided a working example in the pypa bitbucket issue. Of course I was hand-transcribing the example so it showed me that even a small typo will break it. I'd dearly love to have something simple and understandable that would be less prone to hard to debug cargo cult transcription errors.

I've been chatting with Donald Stufft on IRC about possible simplifications for uscan d/watch files. I'd love to be able to have something as simple as:

    version=3
    https://pypi.python.org/simple/mypkg/mypkg-(.*).tar.gz

which is close to what most packages probably use today, modulo the base url path.

I filed a bug against pypa/warehouse so hopefully we can get something better before Jessie is released (which is when I think there will be more pressure for a better solution, since most packages won't be updated during the freeze).

https://github.com/pypa/warehouse/issues/358

Cheers,
-Barry
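For anyone transcribing the PyNaCl watch file quoted above, it may help to see what each of its three expressions does to one link. The Python sketch below translates them with re. It is an approximation, not uscan: it assumes the href pattern must match the whole link, and SAMPLE_URL (including its version number and digest) and the function names are hypothetical.

```python
import re

# Hypothetical link of the shape found on https://pypi.python.org/simple/PyNaCl/
SAMPLE_URL = (
    "https://pypi.python.org/packages/source/P/PyNaCl/"
    "PyNaCl-0.3.0.tar.gz#md5=0123456789abcdef0123456789abcdef"
)

# Translation of the href pattern from the watch file.
VERSION_PATTERN = re.compile(
    r".*/PyNaCl-(.*)\.(?:zip|tgz|tbz2|txz|tar\.(?:gz|bz2|xz)).*"
)


def signature_url(url):
    """pgpsigurlmangle=s/\\#md5.*$/.asc/ : swap the fragment for '.asc'."""
    return re.sub(r"#md5.*$", ".asc", url)


def local_filename(url):
    """filenamemangle=s|.*/(.*)\\#md5.*$|$1| : keep only the bare file name."""
    return re.sub(r".*/(.*)#md5.*$", r"\1", url)


def upstream_version(url):
    """Return the version captured by the href pattern, or None."""
    match = VERSION_PATTERN.fullmatch(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(signature_url(SAMPLE_URL))
    print(local_filename(SAMPLE_URL))
    print(upstream_version(SAMPLE_URL))
```

All three expressions exist mainly to get around the #md5= fragment, which is why a fragment-free listing or redirector would let the watch file shrink to the short form wished for in this message.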