2008/6/12 Sylvain Hellegouarch <[EMAIL PROTECTED]>:
>
>> Can anyone confirm for me what the behaviour should be if someone
>> includes a newline in the value of a WSGI response header?
>>
>> CGI specification would seem to disallow it and thus WSGI adapter
>> should by rights possibly produce an error if user code does it.
>>
>> At the moment I know of no WSGI adapter implementation which validates
>> whether a newline appears in the value of a WSGI response header. For
>> many WSGI adapters this means that a header of:
>>
>>   Key1: "Value1\r\nKey2: Value2"
>>
>> will actually translate into two separate headers being sent back to
>> client.
>>
>> For a header of:
>>
>>   Key3: "Value3a\r\nValue3b"
>>
>> in a WSGI adapter which simply passes things through, the client would
>> get an invalid header line, which in general it would ignore. If
>> however this was generated when hosted with a CGI-WSGI adapter, for
>> Apache at least, Apache would generate a 500 error itself due to
>> detecting a header line of invalid format.
>>
>> Thus, is an embedded newline in value invalid? Would it be reasonable
>> for a WSGI adapter to flag it as an error?
>>
>
> I might be reading the spec wrong but it doesn't seem to be forbidden by
> RFC 2616.
>
> Section 4.2 says:
>
>> Any LWS that occurs between field-content MAY be replaced with a single
>> SP before interpreting the field value or forwarding the message
>> downstream.
>
> Then a look at the definition of separators shows us that SP is a valid
> separator.
>
> Since section 2.1 tells:
>
>> Except where noted otherwise, linear white space (LWS) can be included
>> between any two adjacent words (token or quoted-string), and between
>> adjacent words and separators, without changing the interpretation of a
>> field.
>
> It sounds to me that this is a valid construct but a WSGI adapter might
> consider converting those CRLF into simple SP as said in 2.1 again:
>
>> A recipient MAY replace any linear white space with a single SP before
>> interpreting the field value or forwarding the message downstream.

LWS is defined as:

  LWS = [CRLF] 1*( SP | HT )

I.e., not just a single CRLF, but an optional CRLF followed by at least
one space or tab. Thus, one can't simply replace a CRLF on its own with
a space.

Anyway, the wording of my question and the reference to CGI were a bit
wrong, as WSGI response headers are probably governed more by the HTTP
RFC.

To clarify, what we really have is two cases. The first is the return
of a value containing a valid LWS as specified by the HTTP RFC. If the
WSGI adapter is mapping directly to HTTP, then it can pass the value
straight through. If however the WSGI adapter is hosted on top of an
interface with CGI-like semantics, then it should translate the LWS to
a single space as described.

The second case is an embedded CRLF which isn't followed by a space or
tab and thus isn't an LWS. This is the case which causes problems, and
I am asking whether it should be detected and flagged as an erroneous
response.

Graham
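
To make the two cases concrete, here is a minimal Python sketch of what
a check in an adapter with CGI-like semantics might look like. The
function name unfold_header_value is hypothetical and is not part of
WSGI or of any existing adapter. It assumes the value is a str, replaces
only folds that begin with a CRLF (leaving other whitespace untouched),
and treats any remaining CR or LF as an error, which is the behaviour
being asked about rather than anything the specification mandates.

```python
import re

# A fold is a CRLF followed by at least one SP or HT, i.e. the CRLF form
# of LWS = [CRLF] 1*( SP | HT ) from RFC 2616 section 2.2.
_FOLD = re.compile(r"\r\n[ \t]+")


def unfold_header_value(value):
    """Hypothetical helper: collapse valid folds, reject bare line breaks."""
    # Case 1: each valid fold (CRLF plus trailing SP/HT) becomes one space.
    unfolded = _FOLD.sub(" ", value)
    # Case 2: any CR or LF still present was not part of a valid fold.
    if "\r" in unfolded or "\n" in unfolded:
        raise ValueError("header value contains a bare CR or LF: %r" % value)
    return unfolded
```

With this sketch, unfold_header_value("Value3a\r\n Value3b") returns
"Value3a Value3b", while the Key1 and Key3 values quoted above both
raise ValueError because their CRLF is not followed by a space or tab.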
