Hi Jon,
It seems to me that the major difference here is that we have different
sets of base requirements. And as long as we can't agree on the problem
to be solved, there is simply no way we can agree on the solution.
I think that while the working group has been thinking of a way to make
HTTP transfers safe cross-origin, you are saying that we don't need to:
we can simply make all communication go through JSONRequest over HTTP.
While this is true, I don't agree that it's a very practical solution in
all cases. I think there is a desire for a richer feature set where things
like the examples in requirement 10 work.
I think that sums it up pretty well, but I have still added some
comments below. I think though at some point we'll just have to agree to
disagree.
Comments below:
Jon Ferraiolo wrote:
Hi Jonas,
Thanks for taking the time to provide in-depth responses. See below.
Jon
Jonas Sicking <[EMAIL PROTECTED]> wrote on 01/18/2008 11:08:12 AM:
> Jon Ferraiolo wrote:
> > Hi Anne,
> > The FAQ was very helpful in documenting the rationale behind some of the
> > approaches in the Access Control spec. However, I disagree with most of
> > the reasoning, as shown below. The use cases are a step in the right
> > direction, and I have some feedback on them. (Further down)
> >
> > >Access Control for Cross-site Requests DESIGN DECISION FAQ
> > >
> > >** Why is there a second check for non-GET requests?
> > >
> > >For non-GET requests two checks with Access-Control HTTP headers and
> > ><?access-control?> processing instructions are performed. Initially a
> > >"permission to make the request" check is done on the response to the
> > >authorization request. And then a "permission to read" check is done on the
> > >response of the actual request. Both of these checks need to succeed in
> > >order for success to be relayed to the protocol (e.g. XMLHttpRequest).
> >
> > I appreciate the attention that the WG has put into trying to promote a
> > secure approach to delivering POST requests, but IMO it would be better
> > to pursue a different technical strategy, such as what we see in
> > JSONRequest or something derivative of that. With JSONRequest, POSTs are
> > allowed without requiring a prior GET (or HEAD or OPTIONS). JSONRequest
> > also can be made available via a JavaScript implementation that works in
> > today's browsers. Its approach is simpler for everyone involved, both
> > server-side developers and client-side (i.e., JavaScript) developers.
> > I'm not sure if JSONRequest is perfect as is (e.g., only supports JSON
> > and not XML), but it was designed by Doug Crockford, who gives talks at
> > Ajax conferences on web security, and IMO he has a good handle on
> > security issues.
>
> Using JSONRequest here would not fulfill requirement 7 from the spec as
> JSONRequest can only be used to POST JSON data.
This has been pointed out before, but
(1) JSON can wrap other types of data, such as {xml:"<foo>foo</foo>"}
(2) I have claimed in prior email that JSONRequest would be better if it
supported XML natively and have suggested how JSONRequest might be
changed to do this
I don't see how you could do this while fulfilling requirement 1.
Suggestions appreciated.
Regarding requirement #7, it would be good to include a specific list of
target datatypes, such as JSON, XML, plain text and HTML fragments,
rather than phrasing it as "we should not limit ourselves to content of
a particular type", which reads more as an attempt to exclude JSONRequest
as an alternative than as guidance to help the WG design appropriate
technology.
I don't see how we could in this WG possibly imagine all data formats
that will be used on the web. Just a few years ago everyone was all
about XML. These days JSON seems to have a lot of momentum too. Who can
say what's to come next?
I agree the requirement could be phrased better though. Possibly
something like:
We should not limit the data types that can be transferred using
access-control. This includes data both sent to the server (such as
using POST) as well as data received from the server (such as from a GET
request).
> Maybe we need to clarify that 7 applies to both POSTing and GETting data.
> Anne, would you mind adding text to state that? We should probably
> expand 5 to include sending of data too.
>
> I don't see how JSONRequest can be expanded to include posting XML data
> without violating requirement 1, especially the second bullet point.
As designed today, JSONRequest requires wrapping XML inside of JSON. If
JSONRequest were extended to support XML, then define this enhancement
to JSONRequest such that it takes into account that second bullet.
Again, I don't see how you could do this without allowing reading of any
XML data on any server without explicit server consent.
Please provide a concrete proposal rather than simply "json extended to
support xml".
> > >** Why are cookies and authentication information sent in the request?
> > >
> > >Sending cookies and authentication information enables user-specific
> > >cross-site widgets (external XBL file). It also allows for a user
> > >authenticated data storage API that services can use to store data in.
> >
> > As I have said previously, I disagree that XBL (and XSLT) should impact
> > decisions about how to provide the best technology for cross-site data
> > access. The XBL and XSLT specs can say that user agents must allow
> > cross-domain access, just as is allowed for CSS stylesheets.
>
> I don't understand why you think XML data is unimportant? I don't
> believe that one-format-to-rule-them-all is ever going to work. It
> hasn't in the past.
See above.
>
> > Where did the idea of user-specific widgets come from, anyway? IMO, that
> > would be a very, very low priority (approaching zero).
>
> Why? User specific data seems very useful. It would be great to be able
> to create mashups that pull my calendar data from Google Calendar as
> well as my address book from Yahoo Mail.
Hmm. That's a much more reasonable example than others I have seen. (As
I say over and over, I don't buy the XSLT and XBL arguments.)
However, the trend in mashup space has been to put gadgets into
(different domain) IFRAMEs for security reasons. The Gadgets spec
strongly encourages developers to create gadgets that are self-contained
and therefore can live in a sandbox (i.e., IFRAME). IBM's QEDwiki does
this also. My understanding is that IFRAMEs are the technology of choice
for this scenario for most of the big mashup providers these days.
Because of this, at OpenAjax we have some mashup initiatives that
embrace the IFRAME approach and are focused on how to send messages across
IFRAMEs in a secure manner. IBM Research has a paper on an approach
called SMash
(http://domino.research.ibm.com/library/cyberdig.nsf/papers/0EE2D79F8BE461CE8525731B0009404D/$File/RT0742.pdf)
and contributed the open source for SMash to OpenAjax Alliance. Yahoo
has talked about a similar approach ("CrossFrame":
http://ajax.sys-con.com/read/473127.htm).
Note that if a gadget goes into its own IFRAME, then cookies become much
less of an issue, because the gadget gets its own sandboxed HTML page
whose HTML and JavaScript are fully controlled by the organization
that created the gadget.
First of all, IFRAMEs will not work for cross site communication without
extending the same-origin policy in very complicated and scary ways.
Second, IFRAMEs come with a significant overhead since loaded data is
rendered by the browser.
Looking at the two papers you refer to, it looks like they are
communicating by setting the fragment identifier on cross-origin frames.
This hardly seems like a very desirable solution. I suspect that the
only reason they are doing this is that it works in browsers today.
Technically it seems like an extremely suboptimal solution, both
efficiency-wise and security-wise.
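For readers unfamiliar with the technique, it works roughly like this (a
minimal sketch, not taken from either paper):

    // Host page: pass a message to a cross-origin iframe by changing the
    // part after '#'; this does not reload the frame and is visible to it.
    var frame = document.getElementById("gadget");
    frame.src = frame.src.split("#")[0] + "#" + encodeURIComponent("hello");

    // Inside the gadget: there is no change notification, so poll.
    setInterval(function () {
      var msg = decodeURIComponent(window.location.hash.substring(1));
      if (msg) {
        // handle the message, then clear the channel
        window.location.hash = "";
      }
    }, 100);

which illustrates both the polling overhead and the fact that any script in
the page can see or overwrite the channel.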
> > The negative with sending cookies is that it opens up CSRF attack
> > opportunities when web sites say allow "*", which I expect would be the
> > most common scenario. There are other approaches to sending data to a
> > 3rd party domain that are more secure and still achieve the same result.
> > Once again, JSONRequest does not send cookies. I assume that Doug
> > Crockford assumes that authentication information (if necessary) would
> > be sent with the payload (rather than via cookies), which
> > means that if domain FOO wants to upload data to domain BAR, then domain
> > FOO's web page would require the user to somehow enter their BAR
> > authentication information (which could be stored in FOO's cookies, not
> > BAR's cookies). With such an approach, the user will be told by FOO's
> > web page that this web page needs his BAR account information, so the
> > user gets to opt-in to allowing the cross-site POST, instead of the
> > current approach in Access Control where cookies (potentially with
> > credentials) are always sent without the user being aware.
>
> Without sending cookies we can't satisfy requirement 12. Especially in
> combination with requirement 3.
Others have criticized the second half of my paragraph above, and I
agree with some of the criticisms of that second half, but the
fundamental issue is that cookies open up the possibility of CSRF
attacks, and my belief (along with Doug Crockford's) is that
cross-domain data services (Access Control or JSONRequest or whatever)
are possible without sending cookies, or at least without *always* sending
cookies. Perhaps people have already thought of this, but wouldn't it be
better if cookies were not sent unless a prior OPTIONS request said that
the server wants to see the cookies? In other words, the server would have
to explicitly opt in to cookie transmission.
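Concretely, such an opt-in exchange might look like the following (the
opt-in header shown here is purely hypothetical and is not in the current
draft; it is only meant to illustrate the idea):

    OPTIONS /data HTTP/1.1
    Host: bar.example.com

    HTTP/1.1 200 OK
    X-Accept-Credentials: yes

    POST /data HTTP/1.1
    Host: bar.example.com
    Cookie: session=...

where the Cookie header on the POST would only be included because the
server asked for it in the OPTIONS response.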
So your worry here is that someone will read enough of the spec to see
that they can enable cross-site requests using OPTIONS, but not enough
to see that doing so will also give them requests that contain
authentication info?
While this is a valid concern, I think I would prefer to fix this by
providing clear information in the spec rather than by adding the complexity
of allowing cross-site communication both with and without cookies.
> JSONRequest requires that I give my login data to the requesting site.
> That seems scary.
>
> > >Cookies and authentication information is already sent cross-site for the
> > >HTML <img>, <script>, and <form> elements so this does not introduce a new
> > >attack vector. It simply makes use of the Web.
> >
> > <img> and <script> only work with GET, so if a web server follows best
> > practices (i.e., only support POST when submitting data), then you
> > aren't subject to data submission attacks. There is no way to retrieve
> > data via <img>, so that doesn't allow data retrieval attacks. With
> > <script>, the only way to retrieve data is if the server supports
> > JSON-with-callback.
>
> This is exactly how access control works too. You can issue GET requests
> that include cookies and auth information to 3rd party sites, but you
> can't retrieve data by default. You can only retrieve data if the server
> explicitly allows it.
>
> > Because of these factors, I don't think <img> and
> > <script> should be used to say that the web is already vulnerable.
>
> I'm not sure what you mean by this. All we are saying is that it is
> already possible to issue GET requests that include cookies in a number
> of ways. Do you not agree with this?
>
> > It is true that <form> works with both GET and POST and does send
> > cookies to other domains, which means web site developers today indeed
> > need to protect against CSRF attacks via cross-domain use of <form>,
> > where CSRF protection is usually accomplished by the server actually
> > managing a session with the user where a random token is maintained
> > across the session (without storing it within cookies). Right now,
> > Access Control does not have a good mechanism for achieving
> > server-controlled session management with such a random token, largely
> > because it uses a client-side PEP approach. In fact, Access Control gets
> > in the way of the way most server developers would implement CSRF
> > protection.
>
> Please elaborate as I don't understand what you mean here. If you think
> access control allows for any specific attacks please describe in detail
> the attack.
There was prior criticism of my paragraph above by Thomas. See
http://lists.w3.org/Archives/Public/public-appformats/2008Jan/0191.html
and search for CSRF.
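For reference, the random-token scheme described above works roughly like
this (a framework-agnostic JavaScript sketch; the function names are
invented for illustration and are not from any spec or library):

    // When rendering the page: issue a random token tied to the user's
    // session and embed it in the form as a hidden field.
    function issueToken(session) {
      // use a real CSPRNG in practice; Math.random is only for the sketch
      session.csrfToken = Math.random().toString(36).slice(2);
      return session.csrfToken;
    }

    // When handling the POST: reject the request unless it echoes the token.
    // A cross-site attacker can make the browser send cookies, but cannot
    // read the page, so it cannot know the token.
    function checkToken(session, postedToken) {
      return postedToken === session.csrfToken;
    }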
>
> > >** Why can't cookies / authentication information be provided by the script
> > >author for the request?
> > >
> > >This would allow distributed cookie / user credentials search.
> > >
> > >** Why is the client the policy enforcement point?
> > >
> > >The client already is the policy enforcement point for these requests. The
> > >mechanism allows the server to opt-in to let the client expose the data.
> > >Something it currently does not do and something which servers rely upon the
> > >client not doing.
> >
> > This confuses "access control" with "policy enforcement point". Yes,
> > browsers today implement an access control policy that (currently)
> > prevents (among other things) XHR from talking to other domains and
> > prevents frames from different domains to see each other's DOM. But
> > "policy enforcement point" means deciding which particular users or
> > which particular domains should have the ability to retrieve data or
> > change data.
> >
> > With JSONRequest, the access control rules are loosened such that the
> > new mechanism (i.e., JSONRequest) is allowed to access other domains,
> > and therefore assumes that the policy enforcement point is the server
> > who receives the GET and POST requests. This approach makes more sense
> > and results in a simpler mechanism.
>
> Sure, saying 'anyone can access the data' is simpler. It also seems to
> open new attack vectors. Requirement 1 states that we must not allow new
> attack vectors.
Both Access Control and JSONRequest require the server to opt-in before
data can be retrieved or accepted. With Access Control, the server
inserts some control information into the HTTP headers or into a PI, but
then ACTUALLY SENDS THE DATA TO THE CLIENT, and then trusts that the
client will honor the allow/deny instructions.
This looks like it's true for JSONRequest too. It sends the data to the
client and then trusts that the client will check the content-type
header for the value "application/jsonrequest".
It is also true that both JSONRequest and access control allow the
server to make the call and not send the data at all if it doesn't want
to give access. In the case of JSONRequest the server can inspect
whatever authorization information is supplied in the request data; in
the case of access control the server can check the Referrer-Root header.
This amounts to "anyone
can access the data" since it is trivial to send any arbitrary HTTP
request from a malicious server (or modified browser).
That is true today too, no? Anyone can load any data from my bank site,
all they need to do is to get me to run a malicious server that knows my
password, or get me to run a modified browser.
Or am I misunderstanding you?
I argue that Access Control is dangerous because it provides the server
developer with a false sense of security. Server-side developers will
fall into the trap and think that by saying "allow foo.com", then only
foo.com can see the data, but in fact any malicious server can construct
an HTTP request to retrieve the data. I like the approach in JSONRequest
because it doesn't pretend that the client can be trusted to manage
access rights and forces the server developer to manage access within
the server.
Isn't there already that exact risk? Someone will put a file on their
server and think that only their own webpages can access it?
And isn't JSONRequest doing the same thing by saying that access is
denied unless the content type is "application/jsonrequest"?
Sure, saying "everything is open" removes a lot of concerns. But it
relies on that server administrators are aware that everything is open,
which I have even stronger doubts of.
The big risk here I would say is personal data. And a malicious server
can not reach that as it doesn't have my cookies or auth credentials.
> > >Note however that the server is in full control. Based on the request it can
> > >simply opt to return no data at all or not provide the necessary handshake
> > >(in form of the Access-Control HTTP headers and <?access-control?>
> > >processing instructions).
> > >
> > >** Why does the mechanism do both black- and whitelisting?
>
> This is a result of requirements 4, 8 and 9 together.
I disagree that requirement 8 (or 9) needs to be or should be satisfied
within Access Control, and therefore consider them non-requirements. Nearly all
data retrieval will be managed by some procedural logic, such as PHP or
ASP. The server developer can put the list of black-listed or
white-listed domains into his PHP or ASP scripts, with the result that
you have a server-side PEP, as has been requested by some of the people
on this list. The server approach allows multiple decision approaches.
You can use black-listing, white-listing, or look for a magic key within
the request that only trusted parties could possibly know.
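As a rough sketch of that kind of server-side check (illustrative
JavaScript-style pseudocode; the header name and the request/response
objects are placeholders, not a real API, and the domains reuse the
livejournal example from later in this thread):

    // Refuse to return the data unless the requesting site is on a
    // server-maintained whitelist.
    var allowed = ["http://www.livejournal.com", "http://user.livejournal.com"];

    function handleRequest(request, response) {
      var origin = request.headers["referer-root"];       // who is asking
      if (allowed.indexOf(origin) === -1) {
        response.status = 403;                            // simply don't return the data
        response.send("");
      } else {
        response.send('{"friends": ["user1", "user2"]}'); // only whitelisted sites see this
      }
    }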
Well, I'm not sure what to tell you. These are the requirements that
we've received. 8 was part of the original voice-browser group's
requirements, and I admittedly wouldn't really have a hard time coming
up with use cases for it.
9 came out of the security review we did at mozilla. It isn't a
requirement based on use cases, but rather one to make the spec more secure.
> > >In case the server and documents hosted on the server are in control by
> > >different people it is necessary that the server people are able to override
> > >the document people (if the document wants to share access) and vice versa
> > >(if the server wants to share access).
> > >
> >
> > I think that both whitelisting and blacklisting represent a major
> > weakness in the Access Control spec. I have yet to see important use
> > cases where it makes sense for a particular domain FOO to allow access
> > to a particular domain BAR. In the real world, when would
> > http://www.facebook.com ever list a particular domain to which it would
> > provide access? As best as I can tell, the primary use cases are *
> > (i.e., allow everyone) and *.mydomain.com (i.e., allow access to all of
> > your own subdomains). For simplicity reasons, the best way to go is to
> > drop whitelisting and blacklisting entirely and therefore just support
> > *, which is what JSONRequest does.
>
> I don't think it seems very far-fetched that a set of servers would want
> to collaborate. For example www.livejournal.com and user.livejournal.com
> might want to allow each other to read data. They are basically the same
> site, but use different domain names for various security reasons.
Yes, there are some cases where specific domains are exactly what is
needed, but whitelisting/blacklisting is the best approach in only a
small subset of scenarios, and when needed, can be accomplished by
leveraging server-side mechanisms that already exist today, such as PHP.
Again, with JSONRequest, anyone can make any request, and the server
decides which requests to honor, perhaps using blacklisting/whitelisting
or maybe some other approach.
This makes things a lot more complicated for the content provider since
server side scripting is *required*.
> > >Access Control for Cross-site Requests USE CASES
> > >
> > >FOO in the scenarios below is a fictional person who lives in Havana and
> > >likes playing with Web technology that isn't implemented anywhere.
> >
> > Of course, the "isn't implemented anywhere" is a bit of humor, but it
> > does point out a weakness with Access Control. It won't provide value to
> > the world until it is available in a majority of deployed browsers.
> > Since MS and Apple are not participating in the discussion so far, there
> > is a worry that their browser might never support Access Control. I have
> > no insight into what Apple is thinking regarding Access Control, but I am
> > hearing security concerns from people at MS about Access Control within
> > discussion at OpenAjax. But let's suppose that Apple and MS do come
> > around and ultimately ship it, let's say in 3 years. It will then take
> > roughly another 3 years or so before the majority of deployed browsers
> > support the feature. On the other hand, JSONRequest can be implemented
> > in JavaScript and will work with today's browsers, and therefore the
> > community can use it immediately.
> >
> > What the above paragraph translates into is that I would like to see a
> > requirement that says something about it being highly desirable that the
> > mechanism can run in today's browsers as they exist today (without
> > requiring the addition of a new browser plugin).
>
> You are arguing that we shouldn't design a new standard because the new
> standard doesn't work out of the box in all existing browsers. This
> would seem to limit progress a whole lot. If we followed that we would
> put most of W3C out of business, no?
>
> I've heard ample interest from all major browser vendors, with the possible
> exception of Microsoft, but I don't have as good personal connections
> with anyone on the IE team as I do with the other browser vendors.
I am saying that I believe it is possible to design a cross-domain data
access *API* that could be implemented natively in future browsers and
could be implemented in a reasonable manner via JavaScript in today's
browsers.
How can you implement the API without implementing its functionality?
The primary evidence that I have that this is feasible is that
Kris Zyp has done a pure JavaScript implementation of the JSONRequest
*API* that works with today's browsers, but obviously there will be at
least one feature that he can't accomplish with JavaScript.
If something is already implementable in today's browsers, why are we
here? Couldn't we just say that things are fine and cross-site requests
already work fine on the web?
Also, you can't implement the security mechanisms in JSONRequest in
today's browsers. Neither the parts that protect the 3rd party server
(enforcing the content type having to be application/jsonrequest), nor
the 1st party site (making sure that script can't execute in the security
context of the site).
So any implementation based on today's browsers that claims to implement
JSONRequest would seem outright dangerous!
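As a concrete illustration of the first of those checks (a minimal sketch;
the function name is invented, but the media-type rule is the one
JSONRequest specifies):

    // A native implementation can inspect the response headers and refuse to
    // expose the body unless the server opted in via the media type.
    function extractJSONRequestResponse(xhr) {
      if (xhr.getResponseHeader("Content-Type") !== "application/jsonrequest") {
        return null;                          // server did not opt in; withhold the data
      }
      return JSON.parse(xhr.responseText);    // strict JSON parse, never eval()
    }

A shim built on dynamic <script> tags never sees the response headers at
all, so it simply cannot perform this check.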
But if
JSONRequest were to become the industry standard, then JavaScript
developers could use those APIs today by including a JSONRequest
JavaScript library, where that library would check to see if the
JSONRequest object is already in the browser, if so use it, else build
it. Sure, there are some issues to be addressed with the pure JavaScript
approach, but there is a hugely compelling benefit that the community
doesn't have to wait <n> years before a critical mass of deployment occurs
(and maybe longer, with MS holding effective veto power).
If they used those APIs today they would both be restricted by the
functionality that JSONRequest provides, and they would not be protected
by the security mechanisms that do exist in JSONRequest.
However, if all they need is what is doable today, they should just go
ahead. No need to wait for either an access-control implementation or
a JSONRequest implementation.
> > >* FOO owns test.example.org and test2.example.org. FOO uses XSLT extensively
> > >on both domains and because FOO doesn't want to use a pre-processing script
> > >to duplicate XSLT files he puts them all on test.example.org and includes a
> > ><?access-control allow="test2.example.org"?> at the top of them.
> >
> > IMO, the XSLT and XBL specs should simply say that user agents
> > should allow cross-site access, just like what happens today with CSS
> > and JavaScript. Don't need Access Control for that.
>
> This would seem to open very scary new attack vectors. Just because a
> spec is produced that says that it's ok to load new data types cross
> site doesn't mean that server administrators automatically protect
> themselves against that. Note that both XSLT and XBL can basically use
> any raw XML file. All XSLT needs is one magic attribute on the root
> element, and XBL might not even need that.
Bottom line: I don't buy it. XSLT and XBL should not have any impact on
how the cross-domain data retrieval feature should work.
If we said that XBL could be loaded cross site, what would stop anyone
from loading arbitrary XML data from any server? Such as a server behind
a firewall?
> > >* FOO has implemented the fictional OPEN DATA REST API on test.example.org
> > >to store data so that services that help him keep track of bookmarks,
> > >friends, et cetera can store the info on FOO's domain instead of their own.
> > >This allows FOO to switch to any other service provider taking his data
> > >easily with him. Using Access Control he enables 2del.icio.us.invalid and
> > >flickr2.invalid to access his data so they can store and manipulate data.
> > >To keep other people from messing with his data the API only works if you're
> > >authenticated with test.example.org.
> >
> > I didn't fully understand the above workflow. This I understand: FOO has
> > a web site at test.example.org that implements OPEN DATA REST API. But
> > how is it that del-icio.us or flickr would even know about
> > test.example.org to invoke the OPEN DATA REST APIs on that site? And why
> > doesn't the web page at test.example.org simply invoke flickr or
> > del.icio.us APIs (probably using dynamic script tag with JSON today
> > using a particular API key) to retrieve the data and then upload it to
> > test.example.org? BTW - flickr does have a rich set of APIs today (after
> > all, it's part of Yahoo), but all I could find for del.icio.us were a
> > small number of APIs that seemed to work only via server-side
> > facilities. (Maybe I am missing something.)
>
> The flow is that example.org inc advertises that they have implemented a
> public REST API that provides certain services. Other sites then use that
> API and build functionality on top of that.
>
> Such APIs have been announced by very many web vendors already. Here are
> some examples:
>
> www.flickr.com/services/api
> http://developer.yahoo.com/maps/simple/index.html
> www.google.com/apis/adwords/
> http://wiki.developers.facebook.com/index.php/API
I am well-aware of this industry trend and in fact mentioned the flickr
APIs in my comments. I was asking for more specifics about how data
flowed within this particular use case, and suggesting that the intended
result could be accomplished by a different approach today (i.e.,
without Access Control).
Sure, you could use other APIs too. But then it wouldn't be possible to
use normal HTTP APIs. See my initial note.
> > >* FOO signs up for the personaldata.example.com Web service where you can
> > >enter all kinds of personal information, such as your address, credit card
> > >information, et cetera. Every shopping site that has a contract with
> > >personaldata.example.com can then easily extract data out of it as long as
> > >FOO is authenticated with personaldata.example.com which gives him a better
> > >shopping experience.
> >
> > Yikes! No way would any ecommerce site leverage the browser and access
> > control for anything involving credit card numbers. If there were such a
> > personaldata.example.com web service, then it would implement
> > server-to-server communications to deal with authentication and passing
> > of secure information on a case-by-case basis, with various legal
> > documents among the various parties. Therefore, I do not think this is a
> > valid use case.
>
> I agree that banking is more scary and would generally require very
> strong security checks.
Therefore I hope the above use case is either eliminated or modified.
I don't see the word 'bank' in the spec at all.
> > >* FOO enables cross-site access to his FOAF file and hopes everyone will
> > >follow him so that the Tabulator http://www.w3.org/2005/ajar/tab becomes
> > >easier to use/write/etc.
> >
> > This one needs more detail, such as would FOO allow everyone to have GET
> > access, everyone to have POST access, or what? (Note that if everyone is
> > given access, then there is no need for the complicated allow/deny
> > syntax for this particular use case. In fact, the only use case here
> > that might leverage allow/deny features is the OPEN DATA REST API, but
> > that one needs to be fleshed out some more.)
>
> Yep, this exact use case would not need allow/deny rules.
>
> > Each of the use cases needs a small write-up about how the given use
> > case is accomplished today (or is not possible today) and what proposed
> > alternative technologies (e.g., JSONRequest) might be used instead of
> > access control to achieve the desired result.
>
> As I've stated above, and many times before, JSONRequest fails to meet
> many of our requirements. Such as 4, 5, 7, 8 in combination with 4, 9 I
> think, 10, 11, and 12. And possibly even 1.
Thanks for taking all of that time to respond, but I have to say that I
still remain unconvinced about lots of the decisions. Regarding
JSONRequest, I don't buy the arguments that have been cited for
discarding it, either because I don't agree with particular
requirements, or I disagree with the analysis that concludes that
JSONRequest is unsuitable. But don't get me wrong. I'm not a JSONRequest
zealot. It's just one proposal for how to make the world better in the
area of cross-domain data access. I am just saying that Doug Crockford
did a really good job thinking through the issues, particularly those
related to security, and that I think what he came up with is a better
answer than what's in the current draft of Access Control. However, as I
have said previously, Access Control in its current form provides
positives and negatives (as does everything), and the world will deal
with those positives and negatives, but I just want to make sure that I
am on record as saying that my opinion is that there are different
approaches that would produce greater benefit to the community with
fewer negative consequences.
Duly noted.
Proposals are more than welcome. But if they don't satisfy the set of
requirements we have, I think we need to discuss the requirements first.
Best Regards,
Jonas Sicking