Steven Atreju, Wed, 18 Jul 2012 13:40:30 +0200:

> Except that the internet is almost unusable without cookies
> and scripting, lynx(1) works very well, too, if the ncursesw
> library is linked against (and the terminal font supports
> Unicode characters). Funny that it writes garbage for
>
> |<html><body><p>ä.ü.ö.</p></body></html>
>
> but uses UTF-8 by default for
>
> |<html><body><p>ä.ü.ö.</p></body></html>

Wow, a command line tool that breaks with all you have said about
Unix tools, no? :-) It would be perfectly in line with HTML5 if Lynx,
with or without linking against ncurses, sniffed the first, BOM-less
instance correctly too. However, so far, Chrome seems to be the only
browser that does so by default.

> Hypertext offers a lot of possibilities to declare the charset,
> and until then an agnostic 8-bit parser will do fine except
> for multioctet charsets.

One should perhaps not care about bugs ... But Lynx, in the version
I checked last (probably not linked against ncurses), did not
understand HTML5's new <meta charset="FOO"> any better than it
understood the BOM. It only understood
<meta http-equiv=Content-Type content=FOO>. So, since dropping the
new <meta> element is not really an option, to always also set the
HTTP header on the server is the absolutely safest thing ...
-- 
Leif H Silli
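
To make the order of the charset signals discussed above concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not how Lynx, Chrome, or any other browser actually implements detection: the function name `sniff_encoding`, the 1024-byte prescan window, the simplified regular expression, and the final windows-1252 fallback are all choices made for this example. It checks the BOM first, then the HTTP `Content-Type` header, then either form of `<meta>`, and finally guesses UTF-8 if the bytes happen to decode cleanly.

```python
import codecs
import re
from typing import Optional

# Simplified stand-in for the HTML5 <meta> prescan (assumption: a real
# prescan is considerably stricter). It matches both
# <meta charset="FOO"> and <meta http-equiv=Content-Type content="...; charset=FOO">.
_META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)
_HEADER_CHARSET = re.compile(
    r"""charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def sniff_encoding(body: bytes, http_content_type: Optional[str] = None) -> str:
    """Return a lowercase encoding label for an HTML byte string."""
    # 1. A byte order mark is the strongest signal.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"

    # 2. The charset parameter of the HTTP Content-Type header, if any.
    if http_content_type:
        match = _HEADER_CHARSET.search(http_content_type)
        if match:
            return match.group(1).lower()

    # 3. A <meta> declaration near the start of the document.
    match = _META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()

    # 4. Heuristic for the BOM-less, undeclared case: if the bytes are
    #    valid UTF-8, assume UTF-8; otherwise fall back to windows-1252.
    try:
        body.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "windows-1252"
```

With this sketch, the BOM-less test page from the quoted message is reported as UTF-8 only through the last, heuristic step, which is the step a client that "writes garbage" is presumably missing. Sending the charset in the HTTP header settles the question at step 2 for any client that does not look inside the document at all.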

