--- "आशीष शुक्ला \"Wah Java !!\"" <[EMAIL PROTECTED]> wrote:
> > I am not sure which HTML specification you are looking at but the W3
> > page says quite opposite of what you are claiming
>
> I'm also looking at the same HTML v. 4.01 specification.
>
> > http://www.w3.org/TR/html4/charset.html
>
> above URL also says, this:
>
> -- begin quote --
> To sum up, conforming user agents must observe the following
> priorities when determining a document's character encoding (from
> highest priority to lowest):
> 1. An HTTP "charset" parameter in a "Content-Type" field.
> 2. A META declaration with "http-equiv" set to "Content-Type" and a
>    value set for "charset".
> 3. The charset attribute set on an element that designates an
>    external resource.

So a META declaration will override the Content-Type header, since
Content-Type could well be a server configuration, whereas the META tag
is controlled by the person maintaining the page, who ideally should be
a better judge of what encoding the document is actually in.

> In addition to this list of priorities, the user agent may use
> heuristics and user settings. For example, many user agents use a
> heuristic to distinguish the various encodings used for Japanese
> text. Also, user agents typically have a user-definable, local
> default character encoding which they apply in the absence of other
> indicators.

I believe this is very iffy: the behaviour may change with even a small
patch to whatever browser you are using. Basically, there is no
standard for this behaviour.

> This kind of interaction is great, but it is not the only kind of
> interaction we have. I mean, it works when you have document in
> multiple encodings, and depending on user agent preferences, you
> respond. And, also there has to be someway, by which we can inform
> our webserver that document.html, document.utf8.html,
> document.iso-8859-1.html are same docs in different encodings.
> But, my thing is (explained with an example):

What you are asking for is already implemented in the Apache web
server:
http://httpd.apache.org/docs/1.3/mod/mod_mime.html

> > The majority of the problem starts now. The standards say that the
> > content-type specified by the server is a recommendation or a
> > guideline and not an overriding instruction. The browser is
> > supposed to accept the data in good faith but is supposed to use
> > its own judgement in handling the data. This is the reason why all
> > browsers give you an option to change the charset being used to
> > render the current page.
>
> BTW, which standards says it and where ??

I can't recall the specific standards, but a nice discussion of a
similar topic is available here:
http://ppewww.ph.gla.ac.uk/~flavell/www/content-type.html

Do note that I couldn't locate any reference saying that Content-Type
cannot be overridden at the browser end.

> So, in other words, browser should not trust server.

In a hostile network I would prefer not to. I am not sure of the
specifics, but the bottom line is that it is a matter of trust: should
I trust an unknown server to decide how I treat its data, or should I
be the judge of that? I would rather let applications I trust decide
what to do with anything.

> http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.2.1
> -- begin quote --
> Any HTTP/1.1 message containing an entity-body SHOULD include a
> Content-Type header field defining the media type of that body. If
> and only if the media type is not given by a Content-Type field, the
> recipient MAY attempt to guess the media type via inspection of its
> content and/or the name extension(s) of the URI used to identify the
> resource. If the media type remains unknown, the recipient SHOULD
> treat it as type "application/octet-stream"
> -- end quote --

I think we have deviated a bit from charset to Content-Type. Charset is
not as strictly enforced as Content-Type.
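
To keep the two questions apart, here is a rough Python sketch; it is
an illustration only, not what any particular browser does. All
function and constant names are made up for this example.
`media_type` follows the RFC 2616 passage quoted above, guessing from
the URL extension only and leaving out content inspection.
`resolve_charset` walks the three sources from the HTML 4.01 list
quoted earlier in whatever order the caller passes in, since that
order is the point under dispute here.

```python
import mimetypes
import re

# Order as listed in the quoted HTML 4.01 text (highest priority first).
SPEC_ORDER = ("http", "meta", "element")
# Order implied by the "later source overrides earlier" reading given
# further down in this message.
OVERRIDE_ORDER = ("element", "meta", "http")


def media_type(content_type, url):
    """Media type per the quoted RFC 2616 passage (no content sniffing)."""
    if content_type:
        value = content_type.split(";", 1)[0].strip().lower()
        if value:
            return value
    # Only when no Content-Type was given may the recipient guess,
    # here from the URL's name extension alone.
    guessed, _encoding = mimetypes.guess_type(url)
    return guessed or "application/octet-stream"


def charset_param(value):
    """Return the charset parameter of a Content-Type style value, or None."""
    if not value:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', value, re.IGNORECASE)
    return match.group(1).lower() if match else None


def resolve_charset(http_header=None, meta_content=None,
                    element_charset=None, order=SPEC_ORDER):
    """Pick the first charset found when walking the sources in `order`."""
    found = {
        "http": charset_param(http_header),
        "meta": charset_param(meta_content),
        "element": element_charset.lower() if element_charset else None,
    }
    for source in order:
        if found[source]:
            return found[source]
    return None  # left to heuristics or the user's local default
```

With a header of "text/html; charset=ISO-8859-1" and a META content of
"text/html; charset=UTF-8", SPEC_ORDER yields iso-8859-1 and
OVERRIDE_ORDER yields utf-8. When the header carries no charset
parameter at all, both orders fall through to the META value.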
Yes, there are enough broken web servers out there that claim an rpm is
a RealMedia file to give me headaches.

Coming back to http://www.w3.org/TR/html4/charset.html

My interpretation is:
1. Check for Content-Type and use it if available. Go to item 2 for
   text content.
2. Check for a META tag and use it to override the server-side
   Content-Type.
3. Check for an element charset and override the charset for that
   specific element.

Maybe my interpretation is wrong, but I think that is what happens
currently.

Also, as I mentioned before, we are discussing two different topics
here. Content-Type is a superset of charset: in most scenarios
Content-Type is sent without a charset, which is why META tags play
such a large role.

Mithun

_______________________________________________
ilugd mailinglist -- ilugd@lists.linux-delhi.org
http://frodo.hserus.net/mailman/listinfo/ilugd
Archives at: http://news.gmane.org/gmane.user-groups.linux.delhi
http://www.mail-archive.com/ilugd@lists.linux-delhi.org/