Re: Why can't ap_send_error_response() count on charset?
On Tue, Aug 13, 2002 at 12:52:25PM -0400, Greg Ames sent those random bytes:
> in the html. I am curious to hear what the W3C Validator people say.

Well, my message to the W3C generated a thread of ten emails. This is a short report of their thoughts.

1 - There is no need to specify a meta charset in HTML documents if the charset is given in the Content-Type header.

But there may be an additional complication: some 404s may be in encodings other than iso-8859-1. In that case, the header would be wrong. As long as this is just for the built-in 'last resort' error message that doesn't change, it's okay. But if it's tagged onto any arbitrary error message, it's a problem.

(So with Greg's fix Apache should be fine - Carlo)

BTW, a related problem is the directive 'AddDefaultCharset'. This adds a 'charset' parameter to *every* Content-Type that doesn't already have one. This means that if you have some gifs, they get served as Content-Type: image/gif; charset=foo. This is quite useless.

(About the AddDefaultCharset problem noted by Duerst) The Apache documentation implies that, but it isn't actually the case in my testing with Apache 1.3.26. The charset parameter only seems to be added for text/html and text/plain. It's not added for image/* or text/vnd.wap.wml.

2 - About the default HTML code provided for a 404: (Apache developers) should change [tag stripped by the archive] to [tag stripped by the archive]. The former is for XHTML/XML only, but they've specified HTML 2.0.

3 - Some of the W3C people think having an option 'validate error messages' in the validator form is a good idea, because they want to be able to validate all HTML.

-- 
Carlo Perassi - http://www.linux.it/~carlo/
Do only what only you can do (Edsger Wybe Dijkstra: 1930-2002)
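To make the AddDefaultCharset observation above concrete, here is a minimal sketch of the behaviour reported for Apache 1.3.26: a default charset is appended only to text/html and text/plain, and only when no charset parameter is present. This is not Apache's code. The function names and the fixed two-entry type list are assumptions made for illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: case-insensitive "does s start with prefix?" */
int prefix_icase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Assumption: only these two media types receive a default charset,
 * matching what the 1.3.26 test in the message above observed. */
int takes_default_charset(const char *ctype)
{
    static const char *const types[] = { "text/html", "text/plain" };
    size_t i;

    for (i = 0; i < sizeof types / sizeof types[0]; i++) {
        size_t n = strlen(types[i]);
        /* accept the bare type, or the type followed by parameters */
        if (prefix_icase(ctype, types[i]) &&
            (ctype[n] == '\0' || ctype[n] == ';' || ctype[n] == ' '))
            return 1;
    }
    return 0;
}

/* Returns 1 if the value already carries a charset parameter. */
int has_charset_param(const char *ctype)
{
    const char *p;

    for (p = ctype; *p != '\0'; p++) {
        if (prefix_icase(p, "charset="))
            return 1;
    }
    return 0;
}

/* Writes the Content-Type value to send into out.
 * Returns 1 if the default charset was appended, 0 if left unchanged. */
int with_default_charset(const char *ctype, const char *charset,
                         char *out, size_t outlen)
{
    if (takes_default_charset(ctype) && !has_charset_param(ctype)) {
        snprintf(out, outlen, "%s; charset=%s", ctype, charset);
        return 1;
    }
    snprintf(out, outlen, "%s", ctype);
    return 0;
}
```

With this sketch, "image/gif" and "text/vnd.wap.wml" pass through untouched, while a bare "text/html" gains the configured charset.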
Re: Why can't ap_send_error_response() count on charset?
On Tue, Aug 13, 2002 at 11:06:57AM -0400, Greg Ames sent those random bytes:
> Can you try it again with current cvs HEAD? I'm not familiar with the W3C
> Validator test, but I would hope that if it saw a good http Content-Type header,
> it wouldn't need the stuff in the html meta line.

Me too, but I found a problem/feature in the validator, so I just wrote the following email to the W3C validator team:

/*
Hi all,

the default "404 Not Found" page generated by the latest version of Apache HTTP Server (and the similar pages) doesn't pass the W3C Validator test (it's HTML 2.0 code shipped without a meta tag with a charset value; try this foo page to see it: http://www.apache.org/doesntexist.html).

As I explained to the Apache developers (see http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=102918549709592&w=2 and http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=102925143132691&w=2), it's trivial to change the Apache C code to generate pages that pass the W3C test, but they have technical reasons which don't permit defining a meta tag with a charset definition... so some minutes ago a fix for a header problem appeared on the Apache CVS tree, and as Greg Ames <[EMAIL PROTECTED]> said, "I would hope that if (the Validator) saw a good http Content-Type header, it wouldn't need the stuff in the html meta line."

Before trying the new Apache CVS code... I found a "problem": when your Validator finds a "404" in the response header of the server, it doesn't parse the HTML provided anymore. See this session and, trust me, the validator doesn't parse the code below:

#
# BEGIN
#
carlo@voyager:~$ telnet www.apache.org 80
Trying 63.251.56.142...
Connected to daedalus.apache.org.
Escape character is '^]'.
GET http://www.apache.org/doesntexist.html HTTP/1.0

HTTP/1.1 404 Not Found
Date: Tue, 13 Aug 2002 15:41:38 GMT
Server: Apache/2.0.40 (Unix)
Content-Length: 287
Connection: close
Content-Type: text/html; charset=iso-8859-1

404 Not Found
Not Found
The requested URL /doesntexist.html was not found on this server.
Apache/2.0.40 Server at www.apache.org Port 80
Connection closed by foreign host.
#
# END
#

My question is: why don't you make the Validator parse the HTML code even when the return code is different from 200? If you did this, the Apache team would be able to check whether the fix to the code which produces the response header is enough to pass the test.

Thank you.
*/

So I (we) should wait for their answer.

Thanks.

-- 
Carlo Perassi - http://www.linux.it/~carlo/
Do only what only you can do (Edsger Wybe Dijkstra: 1930-2002)
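The request made to the validator team above amounts to: look at the Content-Type, not at the status code, when deciding whether a body can be checked. A minimal sketch of that idea follows, assuming the status line and the Content-Type value have already been split out of the response. None of this is the W3C Validator's code, every function name is hypothetical, and quoted charset values are not handled.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: case-insensitive "does s start with prefix?" */
int prefix_icase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Returns the numeric status from a line such as "HTTP/1.1 404 Not Found",
 * or -1 if the line does not look like a status line. */
int parse_status(const char *status_line)
{
    int major, minor, code;

    if (sscanf(status_line, "HTTP/%d.%d %d", &major, &minor, &code) == 3)
        return code;
    return -1;
}

/* Copies the charset parameter of a Content-Type value into out.
 * Returns 1 if a non-empty charset was found, 0 otherwise. */
int content_type_charset(const char *value, char *out, size_t outlen)
{
    const char *p = value;
    size_t n = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    while (*p != '\0' && !prefix_icase(p, "charset="))
        p++;
    if (*p == '\0')
        return 0;

    p += strlen("charset=");
    while (*p != '\0' && *p != ';' && !isspace((unsigned char)*p) &&
           n + 1 < outlen)
        out[n++] = *p++;
    out[n] = '\0';
    return n > 0;
}

/* Assumption for illustration: a body is worth validating whenever a
 * status line was parsed and the body is declared as text/html,
 * whatever the status code is (200, 404, ...). */
int body_is_checkable(int status, const char *content_type)
{
    return status != -1 && prefix_icase(content_type, "text/html");
}
```

Applied to the session above, the status parses as 404, the charset as iso-8859-1, and the body still counts as checkable.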
Why can't ap_send_error_response() count on charset?
Hi all.

In modules/http/http_protocol.c the comment says ap_send_error_response is used for any response that can be generated by the server from the request record. This includes all [snip] messages that have not been redirected to another handler via the ErrorDocument feature.

On line 2331 I read:

/* can't count on a charset filter being in place here,
 * so do ebcdic->ascii translation explicitly (if needed)
 */

It's trivial to add, on line 2336, a string like [markup stripped by the archive] or so to ap_rvputs_proto_in_ascii()... but the comment says "can't count on a charset".

Anyway... with the current code, the HTML generated by ap_send_error_response can't pass the W3C Validator test (with the currently missing meta line it would be OK).

I'd like the HTML generated by ap_send_error_response to pass the W3C Validator test in the default configuration (say, without using external HTML files for 404 and so on).

The patch is trivial, but I don't understand why (we) "can't count on a charset filter being in place here".

Thank you.

-- 
Carlo Perassi - http://www.linux.it/~carlo/
Do only what only you can do (Edsger Wybe Dijkstra: 1930-2002)
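For readers wondering what such a meta line might look like: the exact string proposed in the message was lost by the archive, so the following is only a sketch of the conventional HTML form, written as a standalone helper. The function name is hypothetical, and it is not the patch and not part of http_protocol.c. Note that a line like this is only correct if the bytes really are in the named charset, which is the point of the "can't count on a charset filter" comment.

```c
#include <stdio.h>

/* Hypothetical helper: formats a meta line declaring the given charset.
 * Returns 1 on success, 0 if the output buffer was too small. */
int format_meta_charset(char *out, size_t outlen, const char *charset)
{
    int n = snprintf(out, outlen,
                     "<meta http-equiv=\"Content-Type\" "
                     "content=\"text/html; charset=%s\">\n",
                     charset);

    return n >= 0 && (size_t)n < outlen;
}
```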