Hello Larry,
On 2012/10/16 16:12, Larry Masinter wrote:
> I think it would be useful to find a system that requires general IRI syntax to
> be more constrained.
Specs don't actively care. If we look at implementations, I found one
very quickly, in the standard library for Ruby:
> irb
irb(main):001:0> require 'uri'
irb(main):002:0> uri = URI.parse 'http://example.org/abc def/page.html'
URI::InvalidURIError: bad URI(is not URI?): ...
... [details of error removed]
irb(main):003:0> uri = URI.parse 'http://example.org/abcdef/page.html'
=> #<URI::HTTP:..... URL:http://example.org/abcdef/page.html>
irb(main):004:0> exit
[disclaimer: I'm a Ruby committer, but I have never touched that library
so far]
I assume that quite a few other libraries have similar behavior, but
I'm not familiar enough with Java/Python/PHP/Perl to do such a quick and
easy test.
XML Schema and anyURI may also be worth looking at. For good reasons,
it's not very popular around there these days, but some heavy hitters
still use it.
> Of course, individual schemes can constrain their syntax in a scheme-specific
> way (making anything that doesn't match the scheme template
> invalid-as-instance-of-scheme even if valid-as-IRI).
Yes. One question is how to apply such scheme-specific restrictions to
a much more lenient base syntax. It may be very easy, or it may not.
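To make the layering concrete, here is a minimal sketch in Ruby, using
the same standard library as above. It checks the generic syntax first
and then applies a scheme-specific rule on top. The names SCHEME_RULES
and check are made up for this illustration, and the snmp rule (a host
must be present) is a deliberate simplification, not the RFC 4088
grammar.

```ruby
require 'uri'

# Hypothetical scheme-specific rules, keyed by scheme name.
# The snmp rule is a simplification for illustration only,
# not the full RFC 4088 grammar.
SCHEME_RULES = {
  'snmp' => ->(uri) { !uri.host.nil? && !uri.host.empty? }
}.freeze

# Returns :invalid_generic, :invalid_for_scheme, or :ok.
def check(str)
  uri = URI.parse(str)               # layer 1: generic syntax
  rule = SCHEME_RULES[uri.scheme]    # layer 2: scheme-specific rule, if any
  return :invalid_for_scheme if rule && !rule.call(uri)
  :ok
rescue URI::InvalidURIError
  :invalid_generic
end

puts check('http://example.org/abc def/page.html')
puts check('snmp://example.com/bridge1;800002b804616263')
puts check('snmp:/bridge1')
```

Here the base layer is Ruby's strict parser. With a more lenient base
syntax, the first step would accept more input, and each scheme rule
would have to catch whatever the base no longer rejects.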
> And individual contexts (like "space separated list of IRIs") can provide additional
> constraints ("before adding an IRI to a space separated list of IRIs, replace all spaces with
> %20"). But those kinds of rules are layered on top of 3987(bis).
The advantage of having RFC 3986 and RFC 3987(bis) is that it's clear
that one does not have to say anything like "before adding an IRI to a
space separated list of IRIs, replace all spaces with %20". So lots of
specs currently don't say this.
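As a rough sketch of the extra step such specs would have to spell out
if spaces were allowed in the base syntax (the helper name
to_space_separated is made up for this illustration):

```ruby
require 'uri'

# Hypothetical helper: build a space-separated list, replacing
# literal spaces inside each item with %20 first.
def to_space_separated(iris)
  iris.map { |iri| iri.gsub(' ', '%20') }.join(' ')
end

inputs = ['http://example.org/abc def/page.html',
          'http://example.org/abcdef/page.html']

escaped = to_space_separated(inputs)
naive   = inputs.join(' ')

puts escaped.split(' ').length   # one item per input
puts naive.split(' ').length     # the unescaped space splits one item in two

# Every item of the escaped list is accepted by the parser used above.
escaped.split(' ').each { |item| URI.parse(item) }
```

Without the replacement, the list can no longer be split back into the
original items.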
> I'd like to talk out some of these things in Atlanta, should we try to make it
> a separate (bar) bof, or try to use the IRI working group time to talk about
> this?
If you are looking to find examples of IRI/URI-related specs and
implementations that actually check (some of) the restrictions of the
IRI/URI spec, then I think the best way is to ask on a few of the
relevant mailing lists. Asking at the end of the WG meeting, and moving
further discussion to a bar (bof) would then be the next step (if still
necessary).
Regards, Martin.
-----Original Message-----
From: Ted Hardie [mailto:ted.i...@gmail.com]
Sent: Monday, October 15, 2012 1:00 PM
To: Larry Masinter
Cc: Robin Berjon; Anne van Kesteren; p...@w3.org; Peter Saint-Andre
(stpe...@stpeter.im); Pete Resnick (presn...@qualcomm.com); "Martin Dürst
(due...@it.aoyama.ac.jp)"; www-archive@w3.org
Subject: Re: URL work in HTML 5
On Mon, Oct 15, 2012 at 11:37 AM, Larry Masinter<masin...@adobe.com>
wrote:
I think that's the bigger implication -- the vision that the web supplants all
other (network) apps; for some systems, "URLs to non-Web things" is an empty
set.
My understanding of Peter's survey of other specs that make reference to
RFC 3987 was that there weren't any whose implementations relied on anything
other than the browser to do URL/IRI resolution and processing.
First, can you provide a pointer to the survey?
Second, while there may be systems for which the only handle for URIs
is the browser, there are certainly systems for which that is not
true. To pick one produced close to when URIs became a full standard,
look at RFC 4088 (http://tools.ietf.org/html/rfc4088). I doubt there
are many browsers which dereference URIs like
snmp://example.com/bridge1;800002b804616263 with their own handlers.
URIs used internally to systems outside the web may not be easily seen
in a web-based corpus, but that does not mean that they are not there,
nor that shifting the parsing rules won't affect them.
regards,
Ted Hardie