Re: More comments on access-control

Anne van Kesteren Mon, 19 Nov 2007 14:28:54 -0800


Thanks a lot Ian! A few questions below.



On Wed, 14 Nov 2007 20:14:03 +0100, Ian Hickson <[EMAIL PROTECTED]> wrote:

On Wed, 14 Nov 2007, Anne van Kesteren wrote:

  http://dev.w3.org/2006/waf/access-control/


1.1 has an example that reads:

   Access-Control: <hello-world.invalid>

...which seems invalid.


Fixed.

"case-insensitive match" is defined poorly. If it is intended to _only_
be about swapping a-z for A-Z, it should say so explicitly, not in
parenthesis. If it is about full Unicode mapping, then it should be
stated appropriately and the a-z part should be removed. Also, generally
it is better to lowercase and compare than uppercase and compare, since
in full Unicode cases the lowercase versions are more canonical iirc.


Fixed.
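
For what it's worth, the ASCII-only reading could be sketched roughly like
this in Python. The function names are made up for illustration and are not
from the draft; the sketch assumes only A-Z is folded and everything else
is compared as-is:

```python
def ascii_lower(s: str) -> str:
    # Map only A-Z to a-z; every other character is left untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    # Lowercase both sides (rather than uppercasing), then compare.
    return ascii_lower(a) == ascii_lower(b)
```

With this reading, "Example.ORG" matches "example.org", but a non-ASCII
character such as U+212A KELVIN SIGN does not match "k", which a full
Unicode mapping might treat differently.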

The algorithm to "obtain the values from a space-separated list" mixes
its tenses. It starts in the simple present ("must replace"), and then
switches to the present progressive ("dropping ... and then chopping").
The way it is phrased doesn't technically define how you obtain values,
it defines how you replace characters, which for some reason involves
chopping the string.

Any idea how you're going to change
http://www.w3.org/TR/2007/CR-xbl-20070316/#attributes0 as that's pretty
much the text I'm reusing?
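
As a rough illustration of what that text boils down to (replace runs of
space characters, drop leading and trailing spaces, chop at each space),
here is a Python sketch. The function name and the exact set of space
characters are my assumptions, not taken from either spec:

```python
import re

# Assumption: the "space characters" are space, tab, LF and CR.
SPACE_CHARS = " \t\n\r"


def values_from_space_separated_list(s: str) -> list[str]:
    # 1. Replace each run of space characters with a single U+0020.
    collapsed = re.sub(f"[{SPACE_CHARS}]+", " ", s)
    # 2. Drop any leading or trailing U+0020.
    trimmed = collapsed.strip(" ")
    # 3. Chop at each remaining U+0020; an empty string yields no values.
    return trimmed.split(" ") if trimmed else []
```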

2.1 Access Item: "When the access item is used as part of the
Access-Control HTTP header authors must specify the result of applying
the ToASCII algorithm to the internationalized domain name as HTTP does
not
support Unicode." still doesn't make sense to me. The requirement is that
the author provide a purely ASCII domain name, not that they take an IDN
and apply ToASCII, IMHO.


Fixed.

2.1 Access Item: Example "http://example.org:*" is said to be invalid but
as far as I can tell it is valid.


Fixed.

Why is the "*." bit redundant in the domain part? How do I make sure
something matches "livejournal.com" but not "ianhickson.livejournal.com"?


  allow <livejournal.com> exclude <ianhickson.livejournal.com>

or more generic

  allow <livejournal.com> exclude <*.livejournal.com>

There are numerous hosts where the subdomain space isn't trusted but
where the hostname itself is secure, and "example.com" doesn't at all
convey
that all subdomains are also trusted. I think we should require
"*.example.com" to indicate that subdomains are trusted.


Writing

  allow <example.com> <*.example.com>

was expected to be the general case and therefore we previously decided
to go for


  allow <example.com>

to address that case. I'm not really comfortable with revisiting that
once again.

It actually seems that even in the spec there is confusion about this,
for example there is this example:
   Access-Control: allow <example.org> <*.example.org>

Unfortunately, examples are not always updated when the normative text
is. Fixed now.
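
A rough Python sketch of the host matching this implies, assuming a bare
domain covers the host and all of its subdomains, a "*." pattern covers
subdomains only, and exclude wins over allow. The function names are
hypothetical, and schemes and ports are ignored here:

```python
def host_matches(pattern: str, host: str) -> bool:
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*."):
        # Assumption: "*.example.com" covers subdomains only, not the
        # host "example.com" itself.
        return host.endswith(pattern[1:])
    # Assumption: a bare "example.com" covers the host and its subdomains.
    return host == pattern or host.endswith("." + pattern)


def allowed(host: str, allow: list[str], exclude: list[str]) -> bool:
    # A host is let in if some allow item matches and no exclude item does.
    if not any(host_matches(p, host) for p in allow):
        return False
    return not any(host_matches(p, host) for p in exclude)
```

Under those assumptions, allow <livejournal.com> exclude
<*.livejournal.com> lets in livejournal.com but not
ianhickson.livejournal.com.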

2.4. Referer-Root (sic) HTTP header: Do we need to continue misspelling
this?


It seems more consistent with the existing header.

3.1. Cross-site Access Request: "followed by the port (defaulting to the
default port for the scheme) of the resource" -- it makes no sense to
default the port in this case, since the resource had to have a port for
the request to have been made in the first place.


Fixed.

3.1. Cross-site Access Request: "of the resource from which the request
originated" -- is this true? Isn't it of the resource that the calling
spec wants used as the origin? e.g. in XHR I would imagine that the
actual URI used would be the origin, which (e.g. in the case of data:
URIs) might not match the resource's own URI at all. The next paragraph
seems to agree with me.


Fixed.

3.1. Cross-site Access Request: Does the referrer root URI include the
port even if it is the default port?


That's what the definition says, no?
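
To illustrate the reading that the port is always present, here is a
small Python sketch. The function name, the output format and the port
table are assumptions on my part; the draft's definition is what counts:

```python
from urllib.parse import urlsplit

# Assumption: only these two schemes matter for the illustration.
DEFAULT_PORTS = {"http": 80, "https": 443}


def referrer_root(uri: str) -> str:
    parts = urlsplit(uri)
    # The port the resource was actually fetched from; when the URI does
    # not spell it out, that is the scheme's default port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    # The port is written out even when it is the default one.
    return f"{parts.scheme}://{parts.hostname}:{port}"
```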

3.1. Cross-site Access Request: what does "Specifications are strongly
encouraged to define this in equivalent ways." mean?

I reworded this. The intention is that specifications base it on the same
"source" as much as possible.

3.1. Cross-site Access Request: "As this algorithm is used by other
specifications, those specifications must ensure to handle all return
values. Specifications may ignore "reason" if "error" is "true"." -- this
paragraph makes no sense at this point. What algorithm? What return
values? What are "reason" and "error"? I recommend, before this
paragraph, giving an overview of what the algorithms can return.


Done.

3.1.1. Generic Cross-site Access Request Algorithms: "are same-origin" is
not defined yet.


Defined.

3.1.1. Generic Cross-site Access Request Algorithms: It's not clear which
algorithm "this algorithm" is. The "Generic Cross-site Access Request
Algorithms"? The "generic redirect steps"?


Reworded.

3.1.1. Generic Cross-site Access Request Algorithms: What does
"transparently follow the redirect while observing the set of request
rules" mean?

It somehow needs to point back to the algorithm that invoked it where
there is a list of "request rules" which define what to do in case of a
network error, redirect, etc.

Tuples are denoted (like, this) not "like, this". (e.g. in 3.1.1. Generic
Cross-site Access Request Algorithms.)


Fixed.

In fact in general you seem to
overuse quote marks -- I recommend only using them for strings, quotes,
euphemisms, and sarcasm, not for variables and literals.

If you have suggestions for what to use instead that would be welcome.
I'm often wondering what would be best to use in a particular case.

3.1.2. Cross-site GET Access Request: "Perform an access check" isn't
defined yet nor hyperlinked. Same applies in "3.1.3. Cross-site Non-GET
Access Request".


Fixed.

3.1.2. Cross-site GET Access Request: Why do you invent "current request
URI"? It's just given the value of "request URI" and seems to only be
used once, so why not just use "request URI"?

The idea is that "current request URI" is updated during a redirect and
"request URI" always points to the initial starting point. I suppose we
could just update "request URI" along the way. I wasn't sure if that
would be confusing or not.

I'm assuming this is related to
the "macro" steps in "3.1.1. Generic Cross-site Access Request
Algorithms", but it isn't clear to me how this all works. For example,
those refer to "origin" but I don't know what origin that is.

That's defined at the start of the algorithm that invokes it. It's the
referrer root URI.

3.1.3. Cross-site Non-GET Access Request: The first paragraph has the
MUST for the list of steps, but the second paragraph confuses matters by
being
"in the way".


Reordered.

3.1.3. Cross-site Non-GET Access Request: What is the "target URI"?


Typo I think. I can no longer find it. Probably should've been request URI.

3.1.3. Cross-site Non-GET Access Request: Again with the mention of
"origin" -- whose origin? Where does it come from? It doesn't seem to be
any of the arguments passed from the other spec.

It is defined at the start of the algorithm, no? "Let origin be the
referrer root URI."

"If there is a Method-Check-Expires  HTTP response headers that can be
successfully parsed it must be honered." misspells "honored", but in any
case it doesn't define what honoring it means. It should probably say
instead that the entry must be removed once the current time exceeds the
time specified by the header, or some such.


Tried a fix.

I assume how to parse the header is defined somewhere?

It's no better defined than the HTTP-date production (also used in the
Expires header). I'm afraid to look into that.
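
As a sketch of what honoring the header could look like, here is some
Python that leans on the standard library's date parser rather than a
strict HTTP-date grammar (so it is more lenient than the production).
The function names are hypothetical, and what to do with an unparseable
value is deliberately left to the caller:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_expiry(value: str) -> Optional[datetime]:
    # Returns None when the value cannot be parsed as a date.
    try:
        expires = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if expires.tzinfo is None:
        # Assumption: a date without a usable zone is treated as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return expires


def entry_is_stale(expires: datetime, now: datetime) -> bool:
    # The entry goes away once the current time exceeds the header's time.
    return now > expires
```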

3.2. Access Control Check: "The second subsection of this section" is
confusing. I couldn't tell if "this section" was section 3 or section
3.2, and whether the second subsection was 3.2, or 3.2.2. I'd just remove
paragraphs that tell you what you're about to read, frankly.

Ok.

The way you have the "temp method list" defined, you don't cache as
much as you should. Consider a resource with the following:

   <?access-control allow="example.com" method="POST"?>
   <?access-control allow="example.com" method="PUT"?>
   <?access-control allow="example.com" method="DELETE"?>

Now imagine you do a POST followed by a PUT, followed by another POST.

Ideally, we should send a single GET, and then the POST, and then the
PUT, and then the final POST, because we know the PUT will succeed.
However, instead, we will send a GET, a POST, another GET, a PUT, and
then a POST.

Actually, the idea *was* that the PUT would simply not be allowed. Error
flag to "fail" and "detail" to "network". I guess we should revisit that.
See below.

I believe we should cache all the methods that are allowed, not just the
methods of the access-control item that was matched.

Ok, so the idea is to keep "looping" and adding methods to the list when
it's ok?

Incidentally, you should mention whether the authorization request cache
can have multiple items with the same key. (It seems that it can.)


The idea is that you can't. When would this be possible?
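
To check my understanding of the caching being proposed, here is a rough
Python model: one cache entry per (origin, request URI) key, holding every
method the authorization GET showed to be allowed. All names are
hypothetical, and expiry and redirects are left out:

```python
class AuthorizationCache:
    def __init__(self) -> None:
        # One entry per key, so duplicate keys cannot occur.
        self._entries: dict[tuple[str, str], set[str]] = {}

    def store(self, origin: str, request_uri: str, methods: set[str]) -> None:
        self._entries.setdefault((origin, request_uri), set()).update(methods)

    def allows(self, origin: str, request_uri: str, method: str) -> bool:
        return method in self._entries.get((origin, request_uri), set())


def requests_on_the_wire(wanted, allowed_methods, origin, request_uri):
    cache = AuthorizationCache()
    wire = []
    for method in wanted:
        if not cache.allows(origin, request_uri, method):
            # Authorization request; cache *all* allowed methods at once.
            wire.append("GET")
            cache.store(origin, request_uri, set(allowed_methods))
        if cache.allows(origin, request_uri, method):
            wire.append(method)
    return wire
```

For the POST, PUT, POST sequence against the three PIs above, this model
sends a single GET followed by POST, PUT and POST.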

The rules for processing access-control PIs will drop any PI with a
method="" pseudo-attribute at the moment. In fact the pseudo-attribute is
generally not supported by the algorithm as far as I can tell.


Fixed.

The rules for processing access-control PIs look like they won't drop PIs
with multiple pseudo-attributes of the same name other than exclude="".
e.g. <?access-control allow="example.com" allow="example.com"?> doesn't
get dropped by the current rules.

Duplicate pseudo-attributes are a parse error per the <?xml-stylesheet?>
specification.
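
A simplified Python sketch of that behaviour: the parser below rejects a
repeated pseudo-attribute name. It is an illustration only (the regular
expression and function name are mine, and it ignores character
references and malformed input between attributes):

```python
import re

# Simplified: name = "value" or name = 'value'.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*("[^"]*"|'[^']*')""")


def parse_pseudo_attributes(data: str) -> dict[str, str]:
    attrs: dict[str, str] = {}
    for name, quoted in PSEUDO_ATTR.findall(data):
        if name in attrs:
            # A repeated name makes the whole PI a parse error.
            raise ValueError(f"duplicate pseudo-attribute: {name}")
        attrs[name] = quoted[1:-1]  # strip the surrounding quotes
    return attrs
```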

3.3. Access Item Check, step 1: This line is confusing. You are letting
the algorithm's parameters be overwritten by undefined variables. I think
you mean "let origin be..." and "let item be..." not the other way
around.


Fixed.

3.3. Access Item Check, step 6: how can "origin" not have a scheme?


Fixed.

Kind regards,


--
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>
