Saturday, October 25, 2003, 3:29:52 PM, Ilkka Priha wrote:

[snip]
> why not to apply the XML (and XHTML and WML as well) declaration instead
> of a velocity specific one as most markups are based on XML:
>
> <?xml version="1.0" encoding="UTF-8" ?>
[snip]

It can't be used for Velocity templates, because it would be bad if Velocity did not output <?xml ...?> as is (and after all, Velocity templates are not XML).

As for the possibility of automatic charset detection, the problem is with charsets that are not US-ASCII "compatible", such as the EBCDIC-based charsets or UTF-16. XML charset detection works only because the parser knows that a file without a BOM must start with '<', so a leading 4C must mean that the file uses EBCDIC characters, and FE FF and the like must be a BOM (none of FE, FF, or 00 is '<' in any charset). In the case of Velocity templates, however, the first character can be anything.

Still, a practical solution would be to read the file as ISO-8859-1 and, if it starts with #encoding=Foo, re-decode the file using charset Foo. Of course, with this method the special comment can't be detected if the file uses UTF-16 or some EBCDIC variant, but in that case it just falls back to the default encoding as it does now, so you have lost nothing compared to the current situation. At least it works for ISO-8859-X, UTF-8, cpXXX, Shift_JIS, etc. These are the charsets almost everybody uses anyway.

P.S. OK, it is possible to create an EBCDIC file that is wrongly interpreted in ISO-8859-1 as "#encoding=...", but, well, there is no real chance of that happening.

-- 
Best regards,
Daniel Dekany
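
The two-pass idea described above (decode as ISO-8859-1, look for a leading #encoding=Foo, then re-decode with Foo) could be sketched roughly as follows. This is only an illustration, not Velocity code: the class name TemplateEncodingSniffer, its methods, and the 128-byte header limit are all made up for the example, and the #encoding= marker itself is just the proposal from this thread, not an existing Velocity feature.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; nothing here is part of the Velocity API.
final class TemplateEncodingSniffer {
    // Marker proposed in this thread; it must be the very first thing in the file.
    private static final String MARKER = "#encoding=";
    // Assumption: a charset name comfortably fits in a short prefix.
    private static final int MAX_HEADER_BYTES = 128;

    /** Returns the charset named by a leading "#encoding=..." line, or null if none is usable. */
    static Charset declaredCharset(byte[] data) {
        int len = Math.min(data.length, MAX_HEADER_BYTES);
        // ISO-8859-1 maps every byte to exactly one char, so this first pass cannot fail.
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1);
        if (!head.startsWith(MARKER)) {
            return null; // no marker (or UTF-16/EBCDIC, where it cannot be seen)
        }
        // The charset name runs from the marker to the end of the first line.
        int end = MARKER.length();
        while (end < head.length() && head.charAt(end) != '\r' && head.charAt(end) != '\n') {
            end++;
        }
        String name = head.substring(MARKER.length(), end).trim();
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers illegal and unsupported charset names: act as if no marker were present.
            return null;
        }
    }

    /** Second pass: re-decode the whole file with the declared charset, else the default. */
    static String decode(byte[] data, Charset defaultCharset) {
        Charset declared = declaredCharset(data);
        return new String(data, declared != null ? declared : defaultCharset);
    }
}
```

Calling TemplateEncodingSniffer.decode(bytes, defaultCharset) leaves the #encoding= line in the returned text; whether it is stripped or parsed as an ordinary comment would be up to the template loader. A UTF-8 file that starts with a BOM would also miss the marker in this sketch, since the check is a plain startsWith on the first bytes.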
