Picot Chappell <[EMAIL PROTECTED]> writes:

> Why doesn't wget assume that files, which don't declare content
> type, are text/html files?

Good question.  I don't know, perhaps such brokenness never occurred
to me.  And I don't remember anyone reporting it until now.

> I'm looking into patching http.c, so that if type isn't defined it
> gets set to text/html.  Has this been done for 1.8.1 already?  If
> so, can someone pass that patch along to me?
>
> Also, if I do this, will it cause horrible wget hiccups?

I don't think it will make a difference, except improve user
experience in the case that you describe.  Correctly written pages
will not be affected adversely, and that's what truly matters.

Here is a patch that should implement what you need.  Please let me
know if it works for you.

2002-04-16  Hrvoje Niksic  <[EMAIL PROTECTED]>

        * http.c (gethttp): If Content-Type is not given, assume
        text/html.

Index: src/http.c
===================================================================
RCS file: /pack/anoncvs/wget/src/http.c,v
retrieving revision 1.90
diff -u -r1.90 http.c
--- src/http.c  2002/04/14 05:19:27     1.90
+++ src/http.c  2002/04/16 00:14:57
@@ -1308,10 +1308,12 @@
         }
     }
 
-  if (type && !strncasecmp (type, TEXTHTML_S, strlen (TEXTHTML_S)))
+  /* If content-type is not given, assume text/html.  This is because
+     of the multitude of broken CGI's that "forget" to generate the
+     content-type.  */
+  if (!type || 0 == strncasecmp (type, TEXTHTML_S, strlen (TEXTHTML_S)))
     *dt |= TEXTHTML;
   else
-    /* We don't assume text/html by default.  */
     *dt &= ~TEXTHTML;
 
   if (opt.html_extension && (*dt & TEXTHTML))
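
For anyone who wants to check the changed condition outside the wget
tree, here is a minimal standalone sketch of the same test.  The
SKETCH_* macros and the function name are stand-ins made up for this
example, not wget's actual definitions, and the flag values are
arbitrary; only the shape of the condition follows the patch.

```c
#include <stddef.h>   /* NULL */
#include <string.h>   /* strlen */
#include <strings.h>  /* strncasecmp (POSIX) */

/* Stand-ins for illustration only; not copied from wget's headers. */
#define SKETCH_TEXTHTML_S "text/html"
#define SKETCH_TEXTHTML   0x0002   /* "document is HTML" flag bit */
#define SKETCH_OTHERFLAG  0x0001   /* some unrelated flag bit */

/* Set or clear the HTML bit in *dt based on the Content-Type value.
   TYPE is NULL when the server sent no Content-Type header at all;
   in that case the document is treated as text/html.  Only the
   leading "text/html" is compared, so a trailing "; charset=..."
   parameter still matches.  */
void
sketch_set_doctype (const char *type, int *dt)
{
  if (!type
      || 0 == strncasecmp (type, SKETCH_TEXTHTML_S,
                           strlen (SKETCH_TEXTHTML_S)))
    *dt |= SKETCH_TEXTHTML;
  else
    *dt &= ~SKETCH_TEXTHTML;
}
```

Calling sketch_set_doctype (NULL, &dt) sets the HTML bit, while a type
such as "image/png" clears it and leaves the other bits in dt alone.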