Re: PyPI and debian/watch

2015-02-04 Thread Barry Warsaw
On Feb 04, 2015, at 10:53 AM, Donald Stufft wrote:

That same page also mentions that qa.debian.org runs a number of
redirectors for sites like SourceForge and GitHub so perhaps a better
answer is for Debian QA to run a redirector for PyPI instead of PyPI
implementing a redundant API endpoint with a slightly different layout and
HTML primarily for Debian.

+1

Cheers,
-Barry




Re: PyPI and debian/watch

2015-02-04 Thread Ben Finney
Tristan Seligmann mithra...@mithrandi.net writes:

 The debian/watch file I wrote for python-nacl (which also verifies the
 PGP signature) seems to work.

I can't get PGP signature retrieval to work (“uscan warning:
pgpsigurlmangle option exists, but the upstream keyring does not exist”)
even with your suggested pattern.

But I have also written a working uscan configuration::

    opts=filenamemangle=s/\S+\/([^\/]+\.tar\.gz)#md5=[[:alnum:]]+$/$1/ \
    https://pypi.python.org/simple/python-daemon/ \
    \S+/python-daemon-(\S+)\.tar\.gz#md5=[[:alnum:]]+ \
    debian


Barry Warsaw ba...@debian.org writes:

 I'd love to be able to have something as simple as:

 version=3
 https://pypi.python.org/simple/mypkg/mypkg-(.*).tar.gz

 which is close to what most packages probably use today, modulo the
 base url path.

That would be great. But remember that the uscan documentation
recommends a tighter matching pattern, so that would be::

    version=3
    https://pypi.python.org/simple/mypkg/mypkg-(.+).tar.gz

 I filed a bug against pypa/warehouse so hopefully we can get something
 better before Jessie is released (which is when I think there will be
 more pressure for a better solution, since most packages won't be
 updated during the freeze).

 https://github.com/pypa/warehouse/issues/358

Thanks very much!

I'm not a fan of having it live at “…/uscan/” though. This is not
specific to Debian, it's a sensible API design for all.

-- 
 \  “Probably the earliest flyswatters were nothing more than some |
  `\sort of striking surface attached to the end of a long stick.” |
_o__) —Jack Handey |
Ben Finney


-- 
To UNSUBSCRIBE, email to debian-python-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/85h9v2gkz5@benfinney.id.au



Re: PyPI and debian/watch

2015-02-04 Thread Tristan Seligmann
On 4 February 2015 at 10:05, Ben Finney ben+deb...@benfinney.id.au wrote:
 Tristan Seligmann mithra...@mithrandi.net writes:

 The debian/watch file I wrote for python-nacl (which also verifies the
 PGP signature) seems to work.

 I can't get PGP signature retrieval to work (“uscan warning:
 pgpsigurlmangle option exists, but the upstream keyring does not exist”)
 even with your suggested pattern.

You need a debian/upstream/signing-key.asc with the ASCII-armored PGP
keys which will be accepted for signing the release.
-- 
mithrandi, i Ainil en-Balandor, a faer Ambar





Re: PyPI and debian/watch

2015-02-04 Thread Donald Stufft

 On Feb 4, 2015, at 11:20 AM, Barry Warsaw ba...@debian.org wrote:
 
 On Feb 04, 2015, at 10:53 AM, Donald Stufft wrote:
 
 That same page also mentions that qa.debian.org runs a number of
 redirectors for sites like SourceForge and GitHub so perhaps a better
 answer is for Debian QA to run a redirector for PyPI instead of PyPI
 implementing a redundant API endpoint with a slightly different layout and
 HTML primarily for Debian.
 
 +1
 
 Cheers,
 -Barry

Dunno the best way to give this to y'all but I wrote a thing:

https://github.com/dstufft/pypi-debian

I can transfer it on github or release it on PyPI or whatever. It shouldn't
really need any maintenance or anything but I'm probably not going to pay
attention to it if it does need any so someone else might want to actually
own it.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: PyPI and debian/watch

2015-02-04 Thread Tianon Gravi
On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:
 If it gets implemented it'll live at /uscan/ because it exists primarily to
 work around the deficiencies that exist in uscan (Particularly the difficulty
 in ignoring url fragments).

This seems like we're building a workaround to a tool we could
theoretically change. :(

debian/watch has a version=3, which is presumably so that there
can be a version=4 when deficiencies are discovered -- wouldn't it
be worthwhile to consider revbumping the watch format and updating
uscan to have some improved support for edge cases like this?  I know
uscan has some other open bugs too that could use some thought towards
a more flexible format to handle cases like this.

♥,
- Tianon
  4096R / B42F 6819 007F 00F8 8E36  4FD4 036A 9C25 BF35 7DD4





Re: PyPI and debian/watch

2015-02-04 Thread Donald Stufft

 On Feb 4, 2015, at 3:02 PM, Tianon Gravi admwig...@gmail.com wrote:
 
 On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:
 If it gets implemented it'll live at /uscan/ because it exists primarily to
 work around the deficiencies that exist in uscan (Particularly the difficulty
 in ignoring url fragments).
 
 This seems like we're building a workaround to a tool we could
 theoretically change. :(
 
 debian/watch has a version=3, which is presumably so that there
 can be a version=4 when deficiencies are discovered -- wouldn't it
 be worthwhile to consider revbumping the watch format and updating
 uscan to have some improved support for edge cases like this?  I know
 uscan has some other open bugs too that could use some thought towards
 a more flexible format to handle cases like this.

We talked about this in #debian-python and there was concern that a new version
of uscan wouldn’t be in Jessie and then wouldn’t cover the people who need it
the most.

I don’t know if that’s true or not but I certainly think that uscan _should_
ignore anything that comes after a # (similarly to how it ignores anything that
comes after a ?). That would solve the largest problem, that the URL fragment
is hard to remove from the d/watch file. The other problem is that
/simple/whatever/ has files located at /packages/stuff but I believe that’s
not very hard to work around.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: PyPI and debian/watch

2015-02-04 Thread Tianon Gravi
On 4 February 2015 at 13:06, Donald Stufft don...@stufft.io wrote:
 We talked about this in #debian-python and there was concern that a new
 version of uscan wouldn’t be in Jessie and then wouldn’t cover the people who
 need it the most.

Ah right, that makes sense -- I forgot that we were talking about this
specifically in the context of getting some kind of fix into Jessie.

♥,
- Tianon
  4096R / B42F 6819 007F 00F8 8E36  4FD4 036A 9C25 BF35 7DD4





Re: PyPI and debian/watch

2015-02-04 Thread Stefano Rivera
Hi Donald (2015.02.04_22:06:25_+0200)
  On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:
  If it gets implemented it'll live at /uscan/ because it exists primarily to
  work around the deficiencies that exist in uscan (Particularly the
  difficulty in ignoring url fragments).

Would it be that hard to have fake directory listings on /simple/?
I mean, surely keeping compatibility there is simpler than having a
second endpoint just for Debian.

 We talked about this in #debian-python and there was concern that a new
 version of uscan wouldn’t be in Jessie and then wouldn’t cover the people who
 need it the most.

Who needs it the most? We could fix it in unstable and backports. The
DEHS data on tracker.debian.org comes from quantz.debian.org, which is
currently using devscripts from backports.


 I don’t know if that’s true or not but I certainly think that uscan _should_
 ignore anything that comes after a # (similarly to how it ignores anything
 that comes after a ?).

Agreed.

SR

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272





Re: PyPI and debian/watch

2015-02-04 Thread Donald Stufft

 On Feb 4, 2015, at 4:32 PM, Stefano Rivera stefa...@debian.org wrote:
 
 Hi Donald (2015.02.04_22:06:25_+0200)
 On 4 February 2015 at 06:08, Donald Stufft don...@stufft.io wrote:
 If it gets implemented it'll live at /uscan/ because it exists primarily to
 work around the deficiencies that exist in uscan (Particularly the
 difficulty in ignoring url fragments).
 
 Would it be that hard to have fake directory listings on /simple/?
 I mean, surely keeping compatibility there is simpler than having a
 second endpoint just for Debian.

All the data that uscan needs is already on /simple/, and you can make uscan
work with it. There is one major problem and one small problem:

1. Major: The /simple/ URLs all have a #md5=hash fragment, and it’s non-trivial
   to write a d/watch file that ignores it; uscan doesn’t do so by default. You
   can do it, but it’s ugly and prone to copy/paste bugs.

2. Minor: The URLs on /simple/ point to /packages/, so it requires the
   two-argument form of d/watch instead of the single-argument form.

So you can make uscan work right now with /simple/ (and a few people have), but
#1 means that a few of the #debian-python people were not very happy with that
solution. I can’t remove or modify that hash without causing issues with
pip/easy_install, though. Originally I was going to just make a /uscan/ that
was /simple/ without the hash, but instead I suggested to #debian-python that a
redirector might be better, and there is now one at pypi.debian.net.
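
For comparison, here is a minimal sketch of how a client that parses URLs can
handle both problems. It is only an illustration, not what pip or the
redirector actually does; the sample HTML, the package name "mypkg", the
/packages/ path and the hash value are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# Hypothetical excerpt of a /simple/ index page; path and hash are invented.
SAMPLE_PAGE = """
<a href="../../packages/source/m/mypkg/mypkg-1.0.tar.gz#md5=0123456789abcdef0123456789abcdef">mypkg-1.0.tar.gz</a>
"""


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag on the page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def clean_links(page_url, html):
    """Return (download_url, fragment) pairs for each link on the page."""
    collector = LinkCollector()
    collector.feed(html)
    results = []
    for href in collector.hrefs:
        # Problem 2: links are relative and resolve under /packages/.
        absolute = urljoin(page_url, href)
        # Problem 1: split off the "#md5=..." fragment instead of matching it.
        url, fragment = urldefrag(absolute)
        results.append((url, fragment))
    return results


links = clean_links("https://pypi.python.org/simple/mypkg/", SAMPLE_PAGE)
for url, fragment in links:
    print(url, fragment)
```

The fragment stays available for verifying the hash, while the download URL
itself is clean.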


 
 We talked about this in #debian-python and there was concern that a new
 version of uscan wouldn’t be in Jessie and then wouldn’t cover the people who
 need it the most.
 
 Who needs it the most? We could fix it in unstable and backports. The
 DEHS data on tracker.debian.org comes from quantz.debian.org, which is
 currently using devscripts from backports.

No idea, I’m just repeating what folks said in #debian-python; I have no idea
who runs uscan and on what platforms. Between fixing uscan and having a
redirector I don’t have an opinion, since neither one of those has an impact on
what PyPI does.

 
 
 I don’t know if that’s true or not but I certainly think that uscan _should_
 ignore anything that comes after a # (similarly to how it ignores anything
 that comes after a ?).
 
 Agreed.
 
 SR
 
 -- 
 Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272
 
 
 

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: PyPI and debian/watch

2015-02-04 Thread Donald Stufft

 On Feb 4, 2015, at 10:07 AM, Barry Warsaw ba...@debian.org wrote:
 
 On Feb 04, 2015, at 08:08 AM, Donald Stufft wrote:
 
 If it gets implemented it'll live at /uscan/ because it exists primarily to
 work around the deficiencies that exist in uscan (Particularly the difficulty
 in ignoring url fragments). Everyone else should just use the URLs at /simple/
 which most systems use with no problem because they can parse the URLs and
 ignore the URL fragments (or use them for verifying the hash if need be).
 
 I'll just note that I've found the fragments inconvenient in other settings
 too.  They aren't very user friendly since (IMHO) they add noise that users
 cutting and pasting urls generally don't care about.  They also feel odd in
 that the md5 checksum doesn't fit what I think of as a typical fragment.
 Traditionally, they are used to point to an anchor (sub-resource) within the
 parent resource.  That's not the case here.
 
 http://en.wikipedia.org/wiki/Fragment_identifier
 
 has this to say:
 
 
 Several proposals have been made for fragment identifiers for use with plain
 text documents (which cannot store anchor metadata), or to refer to locations
 within HTML documents in which the author has not used anchor tags:
 
 As of September 2012 the Media Fragments URI 1.0 (basic) is a W3C
 Recommendation.[12]
 
 The Python Package Index appends the MD5 hash of a file to the URL as a
 fragment identifier.[13] If MD5 were unbroken (it is a broken hash function),
 it could be used to ensure the integrity of the package.
 
 https://pypi.python.org ... 
 zodbbrowser-0.3.1.tar.gz#md5=38dc89f294b24691d3f0d893ed3c119c
 
 
 So even without the uscan incompatibility (which is just one of the two
 factors leading to noisy d/watch file), I think there's some value in
 fragment-less URLs.  I understand the checksum isn't being used
 cryptographically here, but maybe thinking ahead to the use of more secure
 algorithms in the future can lead to a more flexible design:
 
 Legacy (if it indeed needs to be kept for backward compatibility):
 
 /simple/foo-x.y.z#md5=blah
 
 then:
 
 /simple/plain/foo-x.y.z
 /simple/sha256/foo-x.y.z#sha256=blah
 

Long term PyPI is going to move away from trying to cram a bunch of information
into a hyperlink and relying on HTML parsing and instead is going to move the
installer APIs over to using something more suited to the task, most likely
JSON. At that point we'll be able to design the API to be more extendable in
this regard since we'll be able to do something like:

{
    ...,
    "hashes": {
        "md5": ...,
        "sha256": ...
    },
    ...
}

and the client can simply select which hash it wants to use. Long term the
/simple/ API on PyPI will exist only for legacy purposes so people still using
versions of pip, easy_install, etc that only support /simple/ will still be
able to access PyPI.
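
As a rough illustration of the client side of that idea (a sketch only: the
response shape below just follows the outline above, and the field names,
file name and hash values are invented), selecting a hash could look like:

```python
import json

# Hypothetical response body; nothing here reflects a real PyPI API.
SAMPLE = json.loads("""
{
    "filename": "mypkg-1.0.tar.gz",
    "hashes": {
        "md5": "0123456789abcdef0123456789abcdef",
        "sha256": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
    }
}
""")

# Strongest algorithm first; a client falls back to weaker ones if needed.
PREFERRED = ("sha256", "md5")


def pick_hash(hashes, preferred=PREFERRED):
    """Return (algorithm, digest) for the first preferred algorithm offered."""
    for algo in preferred:
        if algo in hashes:
            return algo, hashes[algo]
    raise LookupError("no supported hash algorithm offered")


algo, digest = pick_hash(SAMPLE["hashes"])
print(algo, digest)
```

An older client that only knows md5 would pass preferred=("md5",) and keep
working against the same response.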

That doesn't really help uscan at all though since as far as I know uscan has
no ability to parse JSON.

As far as copy/pasting goes, the /simple/ API is an API; it's not designed to
be consumed by humans but by software. The UI-centric pages at /pypi are the
ones designed to be consumed by humans (although PyPI currently puts the hash
there as well; Warehouse, aka PyPI 2.0, does not).

The problem here really lies within uscan making assumptions about the
structure of URLs and the content of the HTML on those pages. From looking at
https://wiki.debian.org/debian/watch I'm guessing that it inherited those
assumptions from when FTP was the more common way to distribute files instead
of HTTP(S). That same page also mentions that qa.debian.org runs a number of
redirectors for sites like SourceForge and GitHub so perhaps a better answer
is for Debian QA to run a redirector for PyPI instead of PyPI implementing a
redundant API endpoint with a slightly different layout and HTML primarily for
Debian.

One note in that regard is that the /simple/ indexes don't include the .asc
files if someone has uploaded them, whereas the old URLs that debian/watch used
did. If that is something that is needed, we could easily add them to the
/simple/ pages.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: PyPI and debian/watch

2015-02-04 Thread Donald Stufft

 On Feb 4, 2015, at 3:05 AM, Ben Finney ben+deb...@benfinney.id.au wrote:
 
 Tristan Seligmann mithra...@mithrandi.net writes:
 
 The debian/watch file I wrote for python-nacl (which also verifies the
 PGP signature) seems to work.
 
 I can't get PGP signature retrieval to work (“uscan warning:
 pgpsigurlmangle option exists, but the upstream keyring does not exist”)
 even with your suggested pattern.
 
 But I have also written a working uscan configuration::
 
 opts=filenamemangle=s/\S+\/([^\/]+\.tar\.gz)#md5=[[:alnum:]]+$/$1/ \
https://pypi.python.org/simple/python-daemon/ \
\S+/python-daemon-(\S+)\.tar\.gz#md5=[[:alnum:]]+ \
debian
 
 
 Barry Warsaw ba...@debian.org writes:
 
 I'd love to be able to have something as simple as:
 
 version=3
 https://pypi.python.org/simple/mypkg/mypkg-(.*).tar.gz
 
 which is close to what most packages probably use today, modulo the
 base url path.
 
 That would be great. But remember that the uscan documentation
 recommends a tighter matching pattern, so that would be::
 
version=3
https://pypi.python.org/simple/mypkg/mypkg-(.+).tar.gz
 
 I filed a bug against pypa/warehouse so hopefully we can get something
 better before Jessie is released (which is when I think there will be
 more pressure for a better solution, since most packages won't be
 updated during the freeze).
 
 https://github.com/pypa/warehouse/issues/358
 
 Thanks very much!
 
 I'm not a fan of having it live at “…/uscan/” though. This is not
 specific to Debian, it's a sensible API design for all.
 

If it gets implemented it'll live at /uscan/ because it exists primarily to
work around the deficiencies that exist in uscan (Particularly the difficulty
in ignoring url fragments). Everyone else should just use the URLs at /simple/
which most systems use with no problem because they can parse the URLs and
ignore the URL fragments (or use them for verifying the hash if need be).

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: PyPI and debian/watch

2015-02-04 Thread Barry Warsaw
On Feb 04, 2015, at 08:08 AM, Donald Stufft wrote:

If it gets implemented it'll live at /uscan/ because it exists primarily to
work around the deficiencies that exist in uscan (Particularly the difficulty
in ignoring url fragments). Everyone else should just use the URLs at /simple/
which most systems use with no problem because they can parse the URLs and
ignore the URL fragments (or use them for verifying the hash if need be).

I'll just note that I've found the fragments inconvenient in other settings
too.  They aren't very user friendly since (IMHO) they add noise that users
cutting and pasting urls generally don't care about.  They also feel odd in
that the md5 checksum doesn't fit what I think of as a typical fragment.
Traditionally, they are used to point to an anchor (sub-resource) within the
parent resource.  That's not the case here.

http://en.wikipedia.org/wiki/Fragment_identifier

has this to say:


Several proposals have been made for fragment identifiers for use with plain
text documents (which cannot store anchor metadata), or to refer to locations
within HTML documents in which the author has not used anchor tags:

As of September 2012 the Media Fragments URI 1.0 (basic) is a W3C
Recommendation.[12]

The Python Package Index appends the MD5 hash of a file to the URL as a
fragment identifier.[13] If MD5 were unbroken (it is a broken hash function),
it could be used to ensure the integrity of the package.

https://pypi.python.org ... 
zodbbrowser-0.3.1.tar.gz#md5=38dc89f294b24691d3f0d893ed3c119c


So even without the uscan incompatibility (which is just one of the two
factors leading to noisy d/watch file), I think there's some value in
fragment-less URLs.  I understand the checksum isn't being used
cryptographically here, but maybe thinking ahead to the use of more secure
algorithms in the future can lead to a more flexible design:

Legacy (if it indeed needs to be kept for backward compatibility):

/simple/foo-x.y.z#md5=blah

then:

/simple/plain/foo-x.y.z
/simple/sha256/foo-x.y.z#sha256=blah

etc.

Cheers,
-Barry





Re: PyPI and debian/watch

2015-02-03 Thread Barry Warsaw
On Feb 04, 2015, at 01:09 AM, Tristan Seligmann wrote:

The debian/watch file I wrote for python-nacl (which also verifies the
PGP signature) seems to work. It looks like this:

version=3
opts=pgpsigurlmangle=s/\#md5.*$/.asc/,filenamemangle=s|.*/(.*)\#md5.*$|$1| \
 https://pypi.python.org/simple/PyNaCl/ \
 .*/PyNaCl-(.*)\.(?:zip|tgz|tbz2|txz|tar\.(?:gz|bz2|xz)).*

It's a bit ugly, but it does work.

Yep, thanks.  Sandro also provided a working example in the pypa bitbucket
issue.  Of course I was hand-transcribing the example so it showed me that
even a small typo will break it.  I'd dearly love to have something simple and
understandable that would be less prone to hard to debug cargo cult
transcription errors.  I've been chatting with Donald Stufft on IRC about
possible simplifications for uscan d/watch files.
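
To make it easier to see what the rules in the quoted watch file are meant to
do, here is a rough Python translation of its three regular expressions. This
is a sketch for checking the patterns by hand, not how uscan itself works: the
sample href (path, version and hash) is invented, and anchoring the version
pattern to the whole href is an assumption about uscan's matching.

```python
import re

# Hypothetical href as it might appear on the /simple/PyNaCl/ page.
HREF = "../../packages/source/P/PyNaCl/PyNaCl-0.3.0.tar.gz#md5=" + "0" * 32

# The link pattern from the watch file, unchanged.
VERSION_PATTERN = r".*/PyNaCl-(.*)\.(?:zip|tgz|tbz2|txz|tar\.(?:gz|bz2|xz)).*"


def upstream_version(href):
    """Extract the version the way the watch pattern's first group would."""
    match = re.fullmatch(VERSION_PATTERN, href)
    return match.group(1) if match else None


def mangle_filename(href):
    """Equivalent of filenamemangle=s|.*/(.*)\\#md5.*$|$1|"""
    return re.sub(r".*/(.*)#md5.*$", r"\1", href)


def mangle_sig_url(href):
    """Equivalent of pgpsigurlmangle=s/\\#md5.*$/.asc/"""
    return re.sub(r"#md5.*$", ".asc", href)


print(upstream_version(HREF))
print(mangle_filename(HREF))
print(mangle_sig_url(HREF))
```

Running a candidate pattern through something like this before committing a
d/watch file is one way to catch a transcription typo early.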

I'd love to be able to have something as simple as:

version=3
https://pypi.python.org/simple/mypkg/mypkg-(.*).tar.gz

which is close to what most packages probably use today, modulo the base url
path.

I filed a bug against pypa/warehouse so hopefully we can get something better
before Jessie is released (which is when I think there will be more pressure
for a better solution, since most packages won't be updated during the
freeze).

https://github.com/pypa/warehouse/issues/358

Cheers,
-Barry

