On Wed, 18 Nov 2015 09:02:31 +1300, Robert Collins <robe...@robertcollins.net> wrote:
> On 18 November 2015 at 08:53, Paul Moore <p.f.mo...@gmail.com> wrote:
> > On 17 November 2015 at 18:43, Robert Collins <robe...@robertcollins.net>
> > wrote:
> >>> By including the URL syntax, we're mandating that conforming
> >>> implementations *have* to trap malformed URLs early, and can't defer
> >>> that validation to the URL library being used to process the URL.
> >>
> >> I don't understand how we're mandating that.
> >
> > urlspec = '@' wsp* <URI_reference>
> >
> > combined with
> >
> > URI_reference = <URI | relative_ref>
> > URI = scheme ':' hier_part ('?' query )? ( '#' fragment)?
> > (etc)
> >
> > implies that conforming parsers have to validate that what follows '@'
> > must conform to the URI definition. So they have to reject @:::::
> > because ::::: is not a valid URI. But why bother? It's extra work, and
> > given that all an implementation will ever do with the URI_reference
> > is pass it to a function that treats it as a URI, and that function
> > will do all the validation you need.
> >
> > I'd argue that the spec can simply say
> >
> > URI_reference = <string with no whitespace>
> >
> > The discussion of how a urlspec is used can point out that the string
> > will be assumed to be a URI.
> >
> > A library that parsed any non-whitespace string as a URI_reference
> > would be just as useful for all practical purposes, and much easier to
> > write (and test!) But it would technically be non-conformant to this
> > PEP.
> >
> > Personally, I don't actually care all that much, as I probably won't
> > ever write a library that implements this spec. The packaging library
> > will be fine for me. But given that the point of writing the
> > interoperability PEPs is to ensure people *can* write alternative
> > implementations, I'm against adding complexity and implementation
> > burden that has no practical benefit.
> >
> I'm still struggling to understand.
>
> I see two angles; the first is on what is accepted or not by an
> implementation:
> The reference here is not the implementation - its a *reference*. An
> implementation whose URI handling can't handle std-66 URI's that
> another ones can would lead to interop issues : and thats what we're
> trying to avoid. An alternative implementation whose URI handling has
> some extension that means it handles things that other implementations
> don't would accept everything the PEP mandates but also accept more -
> leading to interop issues. Some interop issues (e.g. pip handles
> git+https:// urls, setuptools doesn't) are not covered yet, but thats
> a pep-440 issue (at least, the way things are split up today) - so I
> don't want to dive into that.
>
> The second is on whether the implementation achieves that acceptance
> up front in its parsing, or on the backend in its URI library. And I
> could care way less which way around it does it. We're not defining
> implementation, but we are defining the language.
>
> As I understand it, you and Antoine are saying that the current PEP
> *does* define implementation because folk can't trust their URI
> library to error appropriately - and thats the bit I don't understand.
> Just parse however you want as an author, and cross check against the
> full grammar here in case of doubt.
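
To make those two angles concrete, here is a rough Python sketch. It is an illustration only, not code from the PEP or from any existing library: every function name is hypothetical, wsp is assumed to mean space or tab, and the check is a deliberately partial stand-in for the full std-66 grammar (it only tests for a well-formed scheme, or for a relative reference whose first path segment contains no colon). It shows the same language being enforced either up front in the parser or later, at the point where the reference is used.

```python
import re

# Assumption: wsp is space or tab, and the reference is any run of
# non-whitespace characters (the "loose" reading discussed above).
_URLSPEC = re.compile(r"@[ \t]*(\S+)")
# Scheme syntax: a letter followed by letters, digits, '+', '-' or '.'.
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def split_urlspec(text):
    """Return the non-whitespace string that follows '@'."""
    match = _URLSPEC.fullmatch(text)
    if match is None:
        raise ValueError(f"not a urlspec: {text!r}")
    return match.group(1)


def check_reference(ref):
    """Partial conformance check, NOT a complete URI validator."""
    if _SCHEME.match(ref):
        return ref
    # Without a scheme, the first path segment must not contain ':'.
    first_segment = re.split(r"[/?#]", ref, maxsplit=1)[0]
    if ":" in first_segment:
        raise ValueError(f"not a URI reference: {ref!r}")
    return ref


def parse_early(text):
    """Angle one: reject malformed references while parsing."""
    return check_reference(split_urlspec(text))


def parse_deferred(text):
    """Angle two: parse loosely and leave validation to the backend."""
    return split_urlspec(text)


def open_reference(ref):
    """Hypothetical stand-in for the URI library that uses the reference."""
    return check_reference(ref)
```

With this sketch, parse_early("@:::::") fails immediately, while parse_deferred("@:::::") succeeds and the failure only surfaces once open_reference() is called on the result. Both pipelines end up accepting and rejecting the same inputs; they differ only in when the rejection happens.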

OK, so it *is* the case that the PEP is mandating that a conforming
implementation has to accept valid and reject invalid URLs according
to the grammar in the PEP, but not *how* or *when* it does that (the
implementation).  So "trap malformed URLs early" is false, but "trap
malformed URLs" is true, if you want to be a conformant implementation.

--David
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig